Executive summary

Overview and motivation

Research goal

How do governments decide where and how to install CCTVs? The goal of this project is to investigate i) the root causes and criteria which, backed up by data, determine CCTV installation in Baltimore, ii) the crime deterrent potential of CCTVs in the American city of Baltimore and iii) the impact on the communities themselves.

Methodology

Using six datasets, we found and combined relevant data about crimes committed in Baltimore, CCTV locations in the city, poverty rates, the areas of Baltimore, the population and the households with internet access.

Main Takeaways

First, our analysis shows that CCTV placement tends to follow the areas where crime per capita is highest. In the north-western and south-western areas of Baltimore, the placement of CCTVs aligns rather well with the areas considered dangerous.

Second, CCTVs capture activities within 256ft (~2 blocks). Nevertheless, we see that some crimes are committed directly in front of CCTVs. Although this is not conclusive evidence, this observation goes against the idea that CCTVs are effective crime deterrents. Moreover, for August 2021 the crime rate per area was highest in the Downtown area. Looking more closely at the Downtown area, we clearly see that some crimes are committed right next to CCTVs.

Third, regarding crime types, the results of a simple linear regression show a weak \(R^2\) for both felonies and misdemeanors. It therefore does not seem that the presence of CCTV has a particularly strong impact on any specific type of crime.

Fourth, regarding wealth and CCTVs, our data shows a poor adjusted \(R^2\) and a weak correlation between CCTV density and wealth. One of our initial hypotheses was that the government showed more respect for the privacy of wealthier people, which the data did not support. However, we are not sure whether wealth is the only influential factor. Again, in the northern parts we see fewer CCTVs, less crime, and also a wealthier population.

Fifth, regarding the distribution and pattern of crime types in Baltimore, we tend to observe a roughly equal distribution of felonies and misdemeanors in each area, with one outlier: in Downtown, the number of misdemeanors per capita is much larger than the number of felonies per capita, and it must be mentioned that this area is also one of the richest areas in Baltimore. This suggests that richer areas are more affected by less severe crimes.

Conclusion

Overall, our multiple regression shows that the main significant variable influencing the CCTV placement decisions of government officials is the number of felonies committed in a community. Thus, our data suggests that unfair considerations such as the wealth level of the neighborhoods, race or educational level do not drive these decisions. In that way, distrust of the government should and can be reduced in Baltimore.

Introduction

Overview and motivation

Video surveillance (CCTV) is a technology that is nowadays deeply woven into the everyday life of many people, as one tends to expect it in many varied circumstances (Ossola, 2019). The rationale behind the installation of these systems seems to be very clear for governments. For example, on Buffalo’s (NY) open data website, one can read that “the City of Buffalo deploys a real-time, citywide video surveillance system to augment the public safety efforts of the Buffalo Police Department”. Yet the development of this technology is not free from controversy. For instance, many observers claim that the expansion of video surveillance poses an unregulated threat to privacy (ACLU, 2021). Still, many people seem to be willing to accept this loss in privacy as the surge in video surveillance makes them feel safer (Madden & Rainie, 2015).

Throughout this research, we challenge the widespread belief that people who have “nothing to hide” should be content with the expansion of CCTV networks as the latter makes them safer (Madden & Rainie, 2015). Indeed, on top of the many privacy issues linked with this surge in video surveillance systems, one might legitimately ask whether these cameras actually make people safer.

The goal of the first phase of this project is to investigate the crime deterrent potential of CCTVs in an American city. This potential will also be compared across the different types of crime that are committed in this area. In a second phase, the dispersion of CCTVs within the city will be investigated. Indeed, according to some research, mass surveillance has a stronger impact on communities already disadvantaged by their poverty, race, religion, ethnicity, or immigration status (Gellman & Adler-Bell, 2017). We would like to see whether our data enables us to validate or invalidate this theory. It would also be extremely interesting, even though challenging, to see whether the installation of surveillance systems could potentially create even more pernicious issues such as crime displacement (Waples, Gill & Fisher, 2009).

In sum we argue that, in a world where CCTVs and other surveillance systems are flourishing, it might be beneficial to take a step back and question both the efficacy and the implementation design of such technologies, since they are often portrayed by different stakeholders as miraculous solutions to very complex issues.

Backgrounds

Augustin: Augustin obtained a degree in Business Administration at the University of St-Gallen where he had the opportunity to develop a strong interest in digital business ethics. He wrote his bachelor’s thesis on the privacy implications of the use of fear appeals in home surveillance devices’ marketing strategy.

Marine: Marine obtained a bachelor’s degree in Law at the UBO (Université de Bretagne-Occidentale). She is currently enrolled in the Master DCS (Droit, Criminalité et Sécurité des technologies de l’information) at the University of Lausanne. Last year, she had the opportunity to take a data protection course and learn more about cyber security and crime in general.

Daniel: Daniel is an exchange student from Koblenz, Germany. Daniel obtained a bachelor’s degree in Business Administration/Management at the WHU - Otto Beisheim School of Management, Germany. He is currently pursuing a Master of Management with a focus on family businesses, entrepreneurship and data science in his courses. Of particular relevance to this project, Daniel spent several months in the United States after high school and can thus relate to the topic of police violence and crime in the US.

Motivations

Firstly, from our respective backgrounds, we derive a strong interest in new technologies and privacy. We believe that every person is entitled to the fundamental right to privacy. Unfortunately, one observes an increasing tendency of governments and other stakeholders (e.g. businesses such as GAFA (Google, Amazon, Facebook, Apple)) to take more and more control in our daily lives through digital technologies such as cameras, computers or smartphones. For these reasons it is interesting to ask ourselves if this massive collection of our data leads to more security or more restrictions of our freedom.

Secondly, under European law such as the GDPR, the collection and processing of our data must be proportionate to the purpose of that processing. Therefore, we are interested in determining whether similar principles apply in the United States and whether the installation of cameras, with the objective of security, really reduces crime and makes a city more secure.

Thirdly, it must also be said that crime and the legislative discussions regarding the right to carry a gun in the United States are fascinating. At first, it seems as if the freedom to carry a gun makes the US more prone to crimes such as mass shootings. To verify or falsify our hypotheses, we also want to see, through the datasets we obtained, what kind of crime prevails in American cities and how it evolves according to the districts and their particularities.

Research questions

  1. Does the presence of CCTVs in a given area actually deter crime?
  2. What types of crimes may be deterred by surveillance cameras?
  3. Is the impact of CCTV installation on crime reduction higher/lower/same in higher income neighborhoods compared to lower income neighborhoods?
  4. Are there more public cameras in lower income/higher unemployment areas compared to higher income/employment areas? (Does the government respect privacy issues depending on your income level?)
  5. Do we observe crime displacement issues caused by the installation of CCTV in some neighbourhoods?
  6. Is there a relationship between internet accessibility of a neighbourhood and crimes/CCTV installations?

Data

Data source

We have six raw data sets. All data sets were retrieved from the Baltimore government’s open data portal. We found data about crimes committed in Baltimore, CCTV locations in the city, poverty rates, the population and the households with internet access. We also found a data set showing the reference boundaries of the Community Statistical Area geographies. The latter will certainly be helpful to match each data set’s observations together.

Raw Data sets

2.1 Crime Data set

This dataset represents the location and characteristics of major crime against persons such as homicide, shooting, robbery, aggravated assault etc. within the city of Baltimore. This data set contains 350’294 observations.

  • RowID = ID of the row, 350’294 in total

  • CrimeDateTime = date and time of the crime. Format yyyy/mm/dd hh:mm:sstzd

  • CrimeCode = Code corresponding to the type of crime committed

  • Location = Textual information on where the crime was committed

  • Description = Textual description of the crime committed corresponding to a CrimeCode.

  • Inside/Outside = Provides information on whether crime was committed inside or outside

  • Weapon = Provides details on what weapon has been used, if any

  • Post = Number corresponding to the Police Post concerned. A map with corresponding police posts can be found here: http://moit.baltimorecity.gov/sites/default/files/police_districts_w_posts.pdf

  • District = Name of the district, regrouping different neighbourhoods. Baltimore is officially divided into nine geographical regions: North, Northeast, East, Southeast, South, Southwest, West, Northwest, and Central.

  • Neighborhood = Name of the neighborhood in which the crime was committed. Most names match the neighborhood names contained in the dataset about Community Statistical Areas.

  • Latitude = Latitude, Coordinate system: EPSG:4326 WGS 84

  • Longitude = Longitude, Coordinate system: EPSG:4326 WGS 84

  • GeoLocation = Combination of latitude and longitude, Coordinate system: EPSG:4326 WGS 84

  • Premise = Information on the premise where the crime was committed. One counts more than 120’000 observations in the streets.

crime_data <- read.csv(file = here::here("data/Baltimore_Part1_Crime_data.csv"))

Source of the data set: [https://data.baltimorecity.gov/datasets/part1-crime-data/explore]

2.2 CCTV Data set

This dataset represents closed circuit camera locations capturing activity within 256ft (~2 blocks). It contains 837 observations in total.

  • X = Longitude: Coordinate system: EPSG:3857 WGS 84 / Pseudo-Mercator

  • Y = Latitude: Coordinate system: EPSG:3857 WGS 84 / Pseudo-Mercator

  • OBJECTID = ID of the camera, 837 in total

  • CAM_NUM = Unique number attributed to the camera. This might suggest that the data set does not show the location of every camera in Baltimore.

  • LOCATION = Textual information on where the camera is located

  • PROJ = Name of the area in which the camera is located. It does not always match the name of the “standard” community statistical areas.

  • XCOORD = Longitude, Coordinate system: EPSG:4326 WGS 84

  • YCOORD = Latitude, Coordinate system: EPSG:4326 WGS 84

cctv_data <- read.csv(file = here::here("data/Baltimore_CCTV_Locations_Crime_Cameras.csv"))

Source of the data set: [https://data.baltimorecity.gov/datasets/cctv-locations-crime-cameras/explore]
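The 256ft capture radius mentioned above can be turned into a simple proximity check between a crime and a camera. The following is only a minimal sketch under stated assumptions: it uses the haversine great-circle formula on the WGS 84 latitude/longitude columns, the helper name haversine_m and all coordinates are hypothetical, and it is not necessarily the method used elsewhere in this report.

```r
# Great-circle distance in meters between two lat/lon points (haversine formula).
# The function name and the mean Earth radius of 6371 km are assumptions of this sketch.
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(pmin(1, a)))
}

# 256 ft expressed in meters (1 ft = 0.3048 m)
radius_m <- 256 * 0.3048

# Hypothetical camera and two hypothetical crime locations (not real observations)
cam    <- data.frame(lat = 39.2900, lon = -76.6100)
crimes <- data.frame(lat = c(39.2903, 39.2920), lon = c(-76.6100, -76.6100))

# Distance of each crime to the camera, and whether it falls inside the radius
crimes$dist_m <- haversine_m(crimes$lat, crimes$lon, cam$lat, cam$lon)
crimes$within_radius <- crimes$dist_m <= radius_m
crimes
```

With real data, the same function could be applied to the Latitude/Longitude columns of the crime data and the XCOORD/YCOORD columns of the CCTV data, since both are expressed in EPSG:4326.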

2.3 Poverty Data set

This dataset provides information about the percent of family households living below the poverty line. This indicator measures the percentage of households whose income fell below the poverty threshold out of all households in an area.

Federal and state governments use such estimates to allocate funds to local communities. Local communities use these estimates to identify the number of individuals or families eligible for various programs. This information will be useful for us to study the dispersion of CCTVs within Baltimore in comparison to the poverty level in a given area. This dataset contains 55 observations, one percentage for each community statistical area. There seems to be only one NA. The most relevant variables are the following:

  • CSA2010 = name of the community statistical area. The Baltimore Data Collaborative and the Baltimore City Department of Planning divided Baltimore into 55 CSAs. These 55 units combine Census Bureau geographies together in ways that match Baltimore’s understanding of community boundaries, and are used in social planning.

  • hhpov15 - hhpov19 = each of these five columns contains the percent of Family Households Living Below the Poverty Line for a given year, from 2015 to 2019.

  • Shape_Area - Shape_Length = standard fields to determine the area and the perimeter of a polygon

poverty_data <- read.csv(file = here::here("data/Percent_of_Family_Households_Living_Below_the_Poverty_Line.csv"))

Source of the data set: [https://arcg.is/1qOrnH]

2.4 Area Data set

This dataset provides information about the Community Statistical Area geographies for Baltimore City, based on aggregations of Census tract (2010) geographies. It will serve as a geographical point of reference for us to match each dataset’s observations together. This dataset contains 55 observations, one for each area. The most relevant variables, as used in the wrangling code later in this report, are Community (the name of the community statistical area) and Neigh (the neighborhoods contained in each area).

area_data <- read_csv(file = here::here("data/Community_Statistical_Areas__CSAs___Reference_Boundaries.csv"))

Source of the data set: [https://data.baltimorecity.gov/datasets/community-statistical-area-1/explore?location=39.284605%2C-76.620550%2C12.26]

2.5 Population Data set

This data set provides information about the population in each Community Statistical Area. The total population in 2010 and in 2020 is provided. It will be useful to calculate values per capita in each community. The most relevant variables are the following:

  • community = name of the community statistical area. The Baltimore Data Collaborative and the Baltimore City Department of Planning divided Baltimore into 55 CSAs. These 55 units combine Census Bureau geographies together in ways that match Baltimore’s understanding of community boundaries, and are used in social planning.

  • tpop20 = total population for each Community Statistical Area in 2020

population_data <- read.csv(file = here::here("data/Total_Population.csv"))

Source of the data set: [INSERT SOURCE HERE]

2.6 Household Internet Data set

This data set gives information about the percentage of households with no internet in each of the 55 Community Statistical Areas. This information is provided for the years 2017, 2018 and 2019. This will be useful to detect whether there is a relationship between internet access and crimes or CCTV installations in neighborhoods. The most important variables are:

  • CSA2010 = name of the community statistical area.

  • nohhint17 = percentage of households in this particular neighborhood with no internet access in the year 2017.

  • nohhint18 = percentage of households in this particular neighborhood with no internet access in the year 2018.

  • nohhint19 = percentage of households in this particular neighborhood with no internet access in the year 2019.

  • Shape_Area = standard field giving the area of a polygon

  • Shape_Length = standard field giving the perimeter of a polygon

Percent_of_Households_with_No_Internet_at_Home <- read.csv(file = here::here("data/Percent_of_Households_with_No_Internet_at_Home.csv"))

Source of the data set: [INSERT SOURCE HERE]

2.7 Data Wrangling

2.7.1 Data Wrangling: Area

Here, the main goal is to transform the area data set into a new data set which contains one observation per neighborhood. Indeed, it is important to distinguish neighborhoods, which are smaller areas, from communities, which are larger and often contain several neighborhoods. We achieve that by first creating a new data set with each neighborhood being assigned to a community using separate_rows and second creating a new column with lower case letters for the later merge. To do so, we combine the mutate function with tolower, which converts the uppercase letters of a string to lowercase.

area_data2 <- separate_rows(area_data, Neigh, sep = ", ") #Creation of a new data set with each neighborhood being assigned to an area

area_data2 <- mutate(area_data2,neigh=tolower(Neigh)) #Creation of new column with lower case letters
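To make the effect of these two steps concrete, here is a small sketch on a made-up table. The community and neighborhood names below are placeholders chosen for illustration and do not reflect the actual CSA composition; only the column names Community and Neigh are taken from the code above.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for area_data: one row per community, with a comma-separated
# list of neighborhoods (all names are hypothetical)
toy_area <- data.frame(
  Community = c("Community A", "Community B"),
  Neigh     = c("First Hill, Second Hill", "Old Town"),
  stringsAsFactors = FALSE
)

toy_area2 <- toy_area %>%
  separate_rows(Neigh, sep = ", ") %>%   # one row per neighborhood
  mutate(neigh = tolower(Neigh))         # lower case key used for the later join

toy_area2
```

The result has three rows instead of two, and each neighborhood keeps the community it came from, which is what makes the later join on neigh possible.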

2.7.2 Data Wrangling: Crime

To join the two data sets on consistently formatted names, we again create a column with lower case letters, this time in the crime data set. We join the area data set and the crime data set using left_join. Next, we use the anti_join function to see which observations have not matched. The outcome shows all the neighborhoods which did not match. As shown below, the issues mostly come from spelling differences (e.g. Mount written Mt.). As very few names do not match, we change them manually.

  • mount washington \(→\) Mt. Washington
  • carroll - camden industrial area \(→\) Caroll-Camden Industrial Area
  • patterson park neighborhood \(→\) Patterson Park
  • glenham-belhar \(→\) Glenham-Belford
  • new southwest/mount clare \(→\) Hollins Market
  • mount winans \(→\) Mt. Winans
  • rosemont homeowners/tenants \(→\) Rosemont
  • broening manor \(→\) O’Donnell Heights
  • boyd-booth \(→\) Booth-boyd
  • lower herring run park \(→\) Herring Run Park
  • mt pleasant park \(→\) Mt. Pleasant Park

crime_data <- mutate(crime_data,neigh=tolower(crime_data$Neighborhood)) #Creation of new column with lower case letters

crime_data_with_areas <- crime_data %>% 
  left_join(area_data2,by="neigh") #We create a new data sets that contains the name of the area in which the crime was committed

crime_data_NAs <- crime_data %>% 
  anti_join(area_data2,
            by="neigh") #Here is the list of all the NAs we have

unique(crime_data_NAs$neigh) #We see that we have very few unassigned names, we can change this by hand.

crime_data["neigh"][crime_data["neigh"]=="mount washington"] <- "mt. washington"
crime_data["neigh"][crime_data["neigh"]=="carroll - camden industrial area"] <- "caroll-camden industrial area"
crime_data["neigh"][crime_data["neigh"]=="patterson park neighborhood"] <- "patterson park"
crime_data["neigh"][crime_data["neigh"]=="glenham-belhar"] <- "glenham-belford"
crime_data["neigh"][crime_data["neigh"]=="new southwest/mount clare"] <- "hollins market"
crime_data["neigh"][crime_data["neigh"]=="mount winans"] <- "mt. winans"
crime_data["neigh"][crime_data["neigh"]=="rosemont homeowners/tenants"] <- "rosemont"
crime_data["neigh"][crime_data["neigh"]=="broening manor"] <- "o'donnell heights"
crime_data["neigh"][crime_data["neigh"]=="boyd-booth"] <- "booth-boyd"
crime_data["neigh"][crime_data["neigh"]=="lower herring run park"] <- "herring run park"
crime_data["neigh"][crime_data["neigh"]=="mt pleasant park"] <- "mt. pleasant park"
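As an aside, the same manual corrections can be expressed more compactly with a named lookup vector. This is only a sketch of an alternative, not the approach used in this report; the vector name fixes is hypothetical and the example covers only three of the corrections listed above.

```r
# Named vector: names are the spellings found in the crime data,
# values are the spellings used in the area data (subset of the list above)
fixes <- c(
  "mount washington" = "mt. washington",
  "mount winans"     = "mt. winans",
  "boyd-booth"       = "booth-boyd"
)

# Toy neighborhood column (the second entry needs no correction)
neigh <- c("mount washington", "downtown", "mount winans", "boyd-booth")

# Replace only the entries that have a correction; leave the others untouched
needs_fix <- neigh %in% names(fixes)
neigh[needs_fix] <- unname(fixes[neigh[needs_fix]])
neigh
```

Keeping all corrections in one vector makes it easier to review or extend the list than a series of separate assignments.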

#We got rid of the 764 remaining observations which had no information about neighbourhood

We get rid of the 764 remaining observations which had no information about neighborhood. This represents a very small portion of our total number of observations. We then use the semi_join function to create the final data set, which is the original data set minus these 764 observations.

Finally, we want to get rid of the observations dating before 2000, as the Baltimore CCTV program started in the year 2000. We first check the structure of the data set using the str function. We notice that the CrimeDateTime column is not a date. We change that and finally filter the information we want to keep using filter.

crime_data_with_areas <- crime_data %>% 
 semi_join(area_data2,by="neigh") %>% 
  left_join(area_data2,by="neigh") #Here we have the final data frame with a community for each crime

str(crime_data_with_areas) # We see that the crime CrimeDateTime column is not a date. We thus convert it.

crime_data_with_areas$CrimeDateTime <-  as.Date(crime_data_with_areas$CrimeDateTime)

crime_data_with_areas <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2000-01-01")) #We had 24 observations that date back to before the year 2000 and 24 observations with no date. We only select crimes committed after 2000 as the CCTV program in Baltimore started in 2000.
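The conversion and filtering steps can be illustrated on a few made-up timestamps. This sketch assumes the yyyy/mm/dd hh:mm:ss layout described for CrimeDateTime; the values and the helper column CrimeDate are hypothetical. It also shows why rows without a date disappear: a comparison with NA is not TRUE, so those rows are dropped together with the pre-2000 ones.

```r
# Toy timestamps in the documented layout; the last entry has no date
toy <- data.frame(
  CrimeDateTime = c("2021/08/14 13:05:00+00", "1999/12/31 23:59:00+00", NA),
  stringsAsFactors = FALSE
)

# Keep only the date part (first 10 characters) and parse it explicitly
toy$CrimeDate <- as.Date(substr(toy$CrimeDateTime, 1, 10), format = "%Y/%m/%d")

# Keep crimes from 2000 onwards; rows with a missing date are excluded as well
kept <- toy[!is.na(toy$CrimeDate) & toy$CrimeDate >= as.Date("2000-01-01"), ]
kept
```

Stating the format explicitly avoids relying on the default formats that as.Date tries on character input.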

2.7.3 Data Wrangling: Poverty

56 areas are included in the standard community statistical area system. However, these 56 statistical areas also include the jail. The poverty data only covers 55 statistical areas, since there is no poverty data for the jail. To resolve this inconsistency, we add a new row. Moreover, we needed to fill a missing value for South Baltimore in the year 2019: here we took the average of the past four years.

poverty_data <- rbind(poverty_data,list(56,"Unassigned -- Jail",0,0,0,0,0,0,0))

poverty_data[48,7] <- c(poverty_data[48,3],poverty_data[48,4],poverty_data[48,5],poverty_data[48,6]) %>% mean() #The poverty rate of South Baltimore in 19 was missing. This area's rate over the past years seems to be stable (always one of the richest area), that's why we compute the mean of the past 4 years to replace the missing value.
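The imputation above addresses the row and columns by position. A slightly more defensive variant, sketched here on made-up numbers, selects the columns by name and fills every missing 2019 value with the mean of the four previous years. The area names and percentages are hypothetical; only the column names follow the poverty data set.

```r
# Toy poverty table (values are invented for illustration)
toy_pov <- data.frame(
  CSA2010 = c("Area A", "Area B"),
  hhpov15 = c(10, 4),
  hhpov16 = c(12, 5),
  hhpov17 = c(11, 6),
  hhpov18 = c(11, 5),
  hhpov19 = c(11.5, NA)
)

past_years <- c("hhpov15", "hhpov16", "hhpov17", "hhpov18")

# Fill each missing 2019 value with the mean of the same row's past four years
is_missing <- is.na(toy_pov$hhpov19)
toy_pov$hhpov19[is_missing] <- rowMeans(toy_pov[is_missing, past_years, drop = FALSE])
toy_pov
```

Selecting by name means the code keeps working if the row order or the column order of the file changes.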

2.7.4 Data Wrangling: CCTV

This data set seems rather tidy. We will mostly use the first two columns, which contain information about the location of each CCTV. We therefore still need to make sure that there are no missing values in these two columns. We do so by combining the which and is.na functions and by filtering for potential empty observations.

which(is.na(cctv_data$X))
#> integer(0)
which(is.na(cctv_data$Y))
#> integer(0)
filter(cctv_data, cctv_data$X=="")
#>  [1] X                Y                OBJECTID        
#>  [4] CAM_NUM          NOTES            LOCATION        
#>  [7] PROJ             XCOORD           YCOORD          
#> [10] created_user     created_date     last_edited_user
#> [13] last_edited_date
#> <0 rows> (or 0-length row.names)
filter(cctv_data, cctv_data$Y=="") 
#>  [1] X                Y                OBJECTID        
#>  [4] CAM_NUM          NOTES            LOCATION        
#>  [7] PROJ             XCOORD           YCOORD          
#> [10] created_user     created_date     last_edited_user
#> [13] last_edited_date
#> <0 rows> (or 0-length row.names)
#We are not sure this is the proper technique, but by doing so we ensure that we have neither NAs nor empty values in these two columns and so that our data set is tidy.
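The same kind of check can be run on all columns at once. The following is a minimal sketch on a toy table, with hypothetical values, that counts NAs and empty strings per column; it is offered as one possible way to do the check, not as the definitive one.

```r
# Toy stand-in for the CCTV table (values are hypothetical)
toy_cctv <- data.frame(
  X        = c(-8529000, -8528000),
  Y        = c(4764500, 4764800),
  LOCATION = c("Main St", ""),
  stringsAsFactors = FALSE
)

# Number of NA values per column
na_counts <- colSums(is.na(toy_cctv))

# Number of empty strings per column (numeric columns are compared as text)
empty_counts <- sapply(toy_cctv, function(col) sum(!is.na(col) & as.character(col) == ""))

na_counts
empty_counts
```

Here the toy LOCATION column contains one empty string, which a plain is.na check would not reveal.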

2.7.5 Data Wrangling: Household internet in CSA’s

At first sight, this household internet data set from Baltimore looks very tidy. Nevertheless, we quickly run some code to detect missing values or anomalies.

sum(is.na(Percent_of_Households_with_No_Internet_at_Home))
#> [1] 0

Having examined the sum of NAs, we see that this dataset is clean. Since there are only 55 rows, the jail was not included when this dataset about internet access was compiled (which makes sense, since the jail probably has its own internet access but has no households to count).

Exploratory data analysis

3.1 Calculation of the density of CCTV per community

The original CCTV data set posed a slight challenge. Although it contained some neighborhood names, most of them did not match the “standard neighborhood” names. To solve this, we used geospatial counting.

Our procedure included the following steps. After reading the table and converting the data into a data table, we define what will be the coordinates of the newly created spatial file. Here we have several types of coordinates, we use X and Y which use the EPSG:3857 WGS 84 / Pseudo-Mercator coordinate system. Spatial files must have coordinate systems assigned to them. In the case at hand, we will work with the above mentioned EPSG:3857 WGS 84 / Pseudo-Mercator coordinate system for all the spatial files that we are going to use. Therefore, to ensure consistency, we create a crs object called crs.geo1 that is going to be assigned to all the spatial files we will use. In order to assign a known crs to spatial data, we use the proj4string function, to which we assign crs.geo1.

#read in data table
balt_dat <-  fread(file = here::here("data/Baltimore_CCTV_Locations_Crime_Cameras.csv"))

#convert to data table
balt_dat <- as.data.table(balt_dat)

#make data spatial
coordinates(balt_dat) <-  c("X","Y")
crs.geo1 <-  CRS("+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs +type=crs")
proj4string(balt_dat) <-  crs.geo1  
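For readers unfamiliar with the two coordinate systems involved, the relationship between EPSG:4326 (longitude/latitude in degrees) and the spherical Pseudo-Mercator system (meters) can be written out by hand. The sketch below uses the standard spherical Mercator formulas with the sphere radius 6378137 that appears in the proj string above; the function names and the sample point are assumptions of this example, and in practice a spatial package should do the conversion.

```r
# Sphere radius used by EPSG:3857 (the +a=6378137 value in the proj string)
R_SPHERE <- 6378137

# Longitude/latitude in degrees -> Pseudo-Mercator X/Y in meters
lonlat_to_mercator <- function(lon, lat) {
  x <- R_SPHERE * lon * pi / 180
  y <- R_SPHERE * log(tan(pi / 4 + lat * pi / 360))
  data.frame(X = x, Y = y)
}

# Pseudo-Mercator X/Y in meters -> longitude/latitude in degrees
mercator_to_lonlat <- function(x, y) {
  lon <- x / R_SPHERE * 180 / pi
  lat <- (2 * atan(exp(y / R_SPHERE)) - pi / 2) * 180 / pi
  data.frame(lon = lon, lat = lat)
}

# Hypothetical point near central Baltimore, converted there and back
pt   <- lonlat_to_mercator(-76.61, 39.29)
back <- mercator_to_lonlat(pt$X, pt$Y)
pt
back
```

This explains why the X and Y columns of the CCTV data are large numbers in meters while XCOORD and YCOORD look like ordinary geographic coordinates.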

Then we plot to see the output (as cloud of points which represent all the CCTVs).

plot(balt_dat, pch = 20, col = "steelblue") #We can use the plot function to quickly plot the SpatialPointsDataFrame that we created. We see a set of points which represent the CCTV locations in Baltimore.

Next, we have to work with the shapefile, which is another special type of file. Basically, it is a set of polygons which represent the different areas of the city of Baltimore. We downloaded this file from the Open Baltimore portal. We read it in and again assign our crs.geo1 coordinate system to it. In this way we ensure that our files have the same coordinate system.

#read in shapefile of baltimore
baltimore <-  readOGR(dsn = here::here("data/Community_Statistical_Area"), layer = "Community_Statistical_Area") #name of file and object
proj4string(baltimore) <- crs.geo1

We can now plot these two spatial files together to see the spread of CCTVs over the 56 community statistical areas.

#plot
plot(baltimore,main="Spread of CCTVs in different communities of Baltimore")
plot(balt_dat,pch=20, col="steelblue" , add=TRUE) #If we plot these two lines together, what we obtain is a map of baltimore, we have the 56 community statistical areas and the CCTVs on top of the map.

To illustrate these results numerically, we need R to count how many CCTVs belong to each area. Here, the function over determines which polygon each CCTV lies in. Next, we tabulate the result in a new object called counts and turn it into a data frame (so that it is easier for us to work with). We use sum to ensure that all 836 observations were counted. This is the case. Still, we notice that we only have 41 rows, meaning that only 41 out of 56 areas contain any CCTV.

#Perform the count
proj4string(balt_dat)
proj4string(baltimore) #To be able to perform the count, we must ensure that the two spatial files have a similar CRS. This is the case as we attributed these two files "crs.geo1" 

res2 <- over(balt_dat,baltimore) #This function tells you to which community each CCTV belongs to
counts <- table(res2$community)
counts <- as.data.frame(counts)
colnames(counts)[1] <- "Community"
sum(counts$Freq) #We see that we have 836 observations in total, this is a good sign as our initial CCTV data set contained 836 observations
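The point-in-polygon count can also be reproduced on a toy example. Note that this sketch swaps in the sf package rather than the sp function over used above, and that the three square "areas" and the camera positions are invented. It is meant to show the idea of counting points per polygon, not to replace the code of this report.

```r
library(sf)

# Helper building a square polygon from its lower-left corner (hypothetical geometry)
square <- function(x0, y0, size = 1000) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + size, y0), c(x0 + size, y0 + size),
    c(x0, y0 + size), c(x0, y0)
  )))
}

# Three adjacent toy areas in EPSG:3857
toy_areas <- st_sf(
  community = c("Area A", "Area B", "Area C"),
  geometry  = st_sfc(square(0, 0), square(1000, 0), square(2000, 0), crs = 3857)
)

# Three toy cameras: two in Area A, one in Area B, none in Area C
toy_cams <- st_as_sf(
  data.frame(X = c(100, 200, 1500), Y = c(100, 900, 500)),
  coords = c("X", "Y"), crs = 3857
)

# For each polygon, count the points that intersect it (zero counts are kept)
toy_areas$Freq <- lengths(st_intersects(toy_areas, toy_cams))
st_drop_geometry(toy_areas)
```

A side effect of counting per polygon is that areas without any camera appear directly with a count of 0, so no separate zero-filling step is needed in this variant.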

To make that workable, we need to create a new CCTV data set in which we set the count to 0 for each area with an NA, i.e. each area without any CCTV. Lastly, we create a new column with the mutate function to calculate the CCTV density, defined as the number of CCTVs in an area divided by the total number of CCTVs, expressed as a percentage.

CCTV_per_area <- area_data[2] %>% 
  left_join(counts,by="Community") #One must add the communities where there are no counts i.e no CCTV

CCTV_per_area[is.na(CCTV_per_area)] <- 0

CCTV_per_area <- mutate(CCTV_per_area, density_perc=(CCTV_per_area$Freq/(sum(CCTV_per_area$Freq)))*100)
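As a small worked example of the zero-filling and density computation, consider three hypothetical areas of which only two have cameras. The sketch below uses base R merge instead of left_join, purely to keep the example free of dependencies; the numbers are invented.

```r
# All areas, and the counts for the areas that have at least one camera
all_areas <- data.frame(Community = c("Area A", "Area B", "Area C"))
counts    <- data.frame(Community = c("Area A", "Area B"), Freq = c(2, 1))

# Left join: keep every area, areas without cameras get NA
per_area <- merge(all_areas, counts, by = "Community", all.x = TRUE)

# Replace the NA counts by 0
per_area$Freq[is.na(per_area$Freq)] <- 0

# Share of all cameras located in each area, in percent
per_area$density_perc <- 100 * per_area$Freq / sum(per_area$Freq)
per_area
```

Note that this "density" is a share of the total number of cameras, so it always sums to 100 and does not account for the surface or the population of an area.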

3.1.1 Mapping of CCTV density

We now want to map CCTV density on the Baltimore map. We first use the %in% operator to check that the communities in the Baltimore data set are the same as the ones in the CCTV per area data set. As this only returns TRUE values, the names match and we can proceed with the analysis.

library(tmap)
baltimore$community %in% CCTV_per_area$Community

Next, we perform a left_join between the Baltimore shape file and the CCTV per area data set. To account for the different column names (one is written with a capital letter and the other with a small letter), we pass a named vector to the by argument. Finally, we create the map with the tmap package. The tmap package works much like the ggplot2 package: a map always starts with tm_shape, and further layers can then be added with the plus operator. We used the Baltimore shape file, filled it with the density percentage, defined some breaks, set the borders and finally the layout.

baltimore@data <- left_join(baltimore@data, CCTV_per_area, by = c('community' = 'Community'))

CCTV_dens_map <- tm_shape(baltimore) + tm_fill(col = "density_perc", title ="CCTV density per Area in %", breaks=c(0,1,2,3,4,5,6,7,8,9,10,11)) + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)

tmap_mode("plot")
CCTV_dens_map

3.2 Calculation of the crime per capita per community

Here we create the CrimeStatsPerArea data set. To achieve that, we group the crime_data_with_areas data set by community and then use summarize, which enables us to compute the crime frequency for each area. Then, using the population data, we can divide the crime frequency by the number of inhabitants in each area. We finally multiply this by 1000 to obtain the crime per 1000 inhabitants. Again, we add one more row because we have no values for the prison. To make sure we made no mistake, we add up the CrimeFrequency column to see whether it equals 349482. This is the case, so we can proceed confidently.

CrimeStatsPerArea <- crime_data_with_areas %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency=n())

CrimeStatsPerArea <-  mutate(CrimeStatsPerArea,CrimePer1000inhabitants=((CrimeStatsPerArea$CrimeFrequency/population_data$tpop20)*1000))

CrimeStatsPerArea <- rbind(CrimeStatsPerArea,list("Unassigned -- Jail",0,0))  #We have no information about crimes committed in jail, yet the community statistical areas encompass 56 areas, including the jail. In order to ensure consistency, we must add a 56th observation to this data frame.

sum(CrimeStatsPerArea$CrimeFrequency) #The total sum is 349482, which is what we expect

Community_data <- CrimeStatsPerArea[,-2] %>% 
  left_join(CCTV_per_area,by="Community") %>%
  left_join(poverty_data[,c(2,7)],by=c("Community"="CSA2010"))
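The per-capita computation above divides two columns that come from different data sets, which is only correct if both are sorted in the same order. A variant that matches population to crime counts by name is sketched below on invented numbers; the column names community and tpop20 follow the population data set, everything else is hypothetical.

```r
library(dplyr)

# Toy crime records (one row per crime) and toy population table,
# deliberately listed in a different order
toy_crimes <- data.frame(Community = c("Area A", "Area A", "Area B"))
toy_pop    <- data.frame(community = c("Area B", "Area A"), tpop20 = c(250, 1000))

toy_stats <- toy_crimes %>%
  count(Community, name = "CrimeFrequency") %>%           # crimes per community
  left_join(toy_pop, by = c("Community" = "community")) %>% # match population by name
  mutate(CrimePer1000inhabitants = CrimeFrequency / tpop20 * 1000)

toy_stats
```

In this toy case Area A has 2 crimes for 1000 inhabitants (2 per 1000) and Area B has 1 crime for 250 inhabitants (4 per 1000), regardless of the order in which the population rows are stored.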

3.2.1 Mapping of crime per capita per community

We want to map crimes per capita per community. The methodology is the same as we did for CCTV density. This time, we use the “quantile” method to create category breaks.

library(tmap)

baltimore$community %in% CrimeStatsPerArea$Community #We see that we have a perfect match

baltimore@data <- left_join(baltimore@data, CrimeStatsPerArea, by = c('community' = 'Community'))

Crime_per_capita_map <- tm_shape(baltimore) + tm_fill(col = "CrimePer1000inhabitants", title ="Crime per capita",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)

tmap_mode("plot")
Crime_per_capita_map
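
To make the “quantile” method more concrete, the sketch below reproduces the idea with base R on made-up rates: the breaks are chosen so that each class contains roughly the same number of areas. The ten values and the choice of five classes are assumptions for illustration, and the breaks computed internally by tmap may differ slightly in detail.

```r
# Hypothetical crime rates per 1000 inhabitants for ten areas.
rates <- c(12, 18, 25, 31, 40, 47, 55, 63, 80, 150)

# Five classes need six break points: the 0%, 20%, ..., 100% quantiles.
breaks <- quantile(rates, probs = seq(0, 1, by = 0.2))

# Assign each area to its class; include.lowest keeps the minimum in class 1.
classes <- cut(rates, breaks = breaks, include.lowest = TRUE, labels = FALSE)

table(classes)
```

Unlike equally spaced breaks, this keeps a single extreme area (here the value 150) from pushing almost all other areas into the lowest class.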

3.2.2 Creation of a distorted map

To visualise how crime per capita is distributed across Baltimore's communities, we use a distorted map. Again, we use the tmap package, this time together with the cartogram_ncont function from the cartogram package, which distorts the map according to the intensity of crime per capita in each community. Concretely, we want to show that crime per capita is higher in the city center than in the suburban areas, which the distorted map conveys quite neatly.

Distorted_Crime_map <- tm_shape(cartogram_ncont(baltimore, "CrimePer1000inhabitants"))+tm_fill(col = "CrimePer1000inhabitants", title ="Crime per 1000 inhabitants per Area",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.07) #This map distorts the size of each area according to its crime per capita. It is interesting because it shows that higher crime per capita tends to be concentrated in the city center.

tmap_mode("plot")
Distorted_Crime_map

3.2.3 The prison anomaly

One may wonder what the little square with no crime per capita, surrounded by CCTVs in the very center of Baltimore, is. It is actually the prison. Zooming in on this little square is interesting: the reason we see so many CCTVs on the right-hand side of the map is that the main entrance is located there.

tmap_mode("plot")

Prison_area <-  st_bbox(c(xmin = -8529169.92, xmax = -8526465.97,
                      ymin =4764196.55, ymax = 4765056.50),
                    crs = st_crs(baltimore)) %>% st_as_sfc()
 
Prison_map <- tm_shape(Prison_area) + tm_borders(col="black",alpha=0.3)+ tm_shape(baltimore) + tm_fill(col = "CrimePer1000inhabitants", title ="Crime per capita per Area",style = "quantile") + tm_borders(col="black") + tm_layout(inner.margins = 0.05,frame.lwd = 5,title = "Zoom on Baltimore Prison",title.position = c('left', 'top'))+tm_scale_bar(position = c("left", "top"))+ tm_shape(balt_dat) + tm_dots(col="black") #This map zooms in on the prison. This "Area" is special: we have no crime data for it, and we can see a large concentration of CCTVs directly next to it.


Baltimore_map <- tm_shape(baltimore) + tm_borders()+ tm_shape(Prison_area) + tm_borders(lwd = 3,col = "red") + tm_layout(frame.lwd = 6,inner.margins = 0.05)


Prison_map
print(Baltimore_map, vp = viewport(0.8, 0.27, width = 0.5, height = 0.5)) #By running these two lines together, we obtain the zoomed prison map with an inset showing where it lies within Baltimore

3.3 Calculation of crime per capita by type of crime

The first thing we do here is list the unique values of the Description column of the crime data set. We see that there are 14 types of crime. To analyse crimes by type, we group them into broader classes. The law distinguishes three basic classes of criminal offenses: infractions, misdemeanors, and felonies. Our data set contains no infractions, so the 14 types of crime are divided between the two remaining categories as follows.

  • Misdemeanor: LARCENY FROM AUTO, COMMON ASSAULT, ROBBERY - COMMERCIAL, LARCENY
  • Felony: RAPE, ARSON, HOMICIDE, BURGLARY, AUTO THEFT, ROBBERY - CARJACKING, AGG. ASSAULT, ROBBERY - STREET, ROBBERY - RESIDENCE, SHOOTING

unique(crime_data_with_areas$Description)

#We see that we have 14 types of crime. To analyse crimes by type, we group them into broader classes. The law distinguishes three basic classes of criminal offenses: infractions, misdemeanors, and felonies. Our data set contains no infractions.

#Misdemeanor: LARCENY FROM AUTO, COMMON ASSAULT, ROBBERY - COMMERCIAL, LARCENY
#Felony: RAPE, ARSON, HOMICIDE, BURGLARY, AUTO THEFT, ROBBERY - CARJACKING, AGG. ASSAULT, ROBBERY - STREET, ROBBERY - RESIDENCE, SHOOTING

Next, we create a data set called crime_cat, which maps each recorded crime type to its crime category. We then left-join it to crime_data_with_areas, which leaves us with the crime data set extended by a new column indicating whether each crime was a felony or a misdemeanor.

crime_cat <- data.frame(Category=c("Misdemeanor","Felony"), Description=c(c("LARCENY FROM AUTO,COMMON ASSAULT,ROBBERY - COMMERCIAL,LARCENY"),c("RAPE,ARSON,HOMICIDE,BURGLARY,AUTO THEFT,ROBBERY - CARJACKING,AGG. ASSAULT,ROBBERY - STREET,ROBBERY - RESIDENCE,SHOOTING")))

crime_cat <- separate_rows(crime_cat, Description, sep = ",")

crime_cat$Description %in% unique(crime_data_with_areas$Description) #Ensure we have a perfect match

crime_data_with_areas <- crime_data_with_areas %>% 
  left_join(crime_cat,by="Description") #We add a new variable to our crime data set
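
The %in% check above confirms that every entry of the lookup table occurs in the crime data. The reverse direction, namely whether every recorded crime type has a category, can be checked with an anti-join. The following is only a sketch on made-up data; the shortened category lists and the crimes tibble are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy lookup table, built the same way as crime_cat but with fewer entries.
crime_cat_toy <- tibble(
  Category    = c("Misdemeanor", "Felony"),
  Description = c("LARCENY,COMMON ASSAULT", "ARSON,BURGLARY")
) %>%
  separate_rows(Description, sep = ",")

# Toy crime records; "SHOOTING" is deliberately absent from the lookup table.
crimes <- tibble(Description = c("LARCENY", "ARSON", "ARSON", "SHOOTING"))

# Descriptions present in the crime records but missing from the lookup table.
unmatched <- crimes %>%
  anti_join(crime_cat_toy, by = "Description") %>%
  distinct(Description)

unmatched
```

An empty result would mean that the left join leaves no crime without a category.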

Next, we compute CrimePerCategoryPerArea. We again use the piping operator, this time grouping by both community and category. Once more, we check that we indeed have 349482 observations. From this we compute felonies and misdemeanors per capita in each community and, as before, add the prison row to the newly created data sets.

CrimePerCategoryPerArea <- crime_data_with_areas %>% 
  group_by(Community,Category) %>%
  summarize(RepartitionPerCategoryPerArea=n())

sum(CrimePerCategoryPerArea$RepartitionPerCategoryPerArea) #Again, we check that we indeed have 349482 observations

CrimeCategoryRepartition <- CrimePerCategoryPerArea %>% 
  group_by(Category) %>% 
  summarise(Repartition=sum(RepartitionPerCategoryPerArea)) #We observe that in Baltimore, the number of felonies is close to the number of misdemeanors

FelonyStats <-  CrimePerCategoryPerArea %>% filter(Category=="Felony") 

FelonyStats$FelonyPerCapitaPerArea <-((CrimePerCategoryPerArea%>% filter(Category=="Felony"))[[3]]/population_data$tpop20)*1000

FelonyStats[56,] <- list("Unassigned -- Jail","Felony",0,0)

MisdemeanorStats <-  CrimePerCategoryPerArea %>% filter(Category=="Misdemeanor") 

MisdemeanorStats$MisdemeanorPerCapitaPerArea <-((CrimePerCategoryPerArea%>% filter(Category=="Misdemeanor"))[[3]]/population_data$tpop20)*1000

MisdemeanorStats[56,] <- list("Unassigned -- Jail","Misdemeanor",0,0)

Community_data <- Community_data %>% 
  left_join(FelonyStats[,-c(2:3)],by="Community") %>%
  left_join(MisdemeanorStats[,-c(2:3)],by="Community")

As mentioned earlier, it is also possible to divide the crimes committed in Baltimore by ‘type’ of crime. A distinction is generally made between property crime and violent crime. In a property crime, a victim’s property is stolen or destroyed, without the use or threat of force against the victim. Property crimes include burglary and theft as well as vandalism and arson. In a violent crime, a victim is harmed by or threatened with violence. Violent crimes include rape and sexual assault, robbery, assault and murder.

To determine whether each crime in crime_data_with_areas is a violent or a property crime, we use another data set provided by the Baltimore open data portal. It describes the crime codes used by the police to categorize crimes. We first import the data set and then check that its codes match those in our crime data: three crime codes are written with a trailing blank space, which we correct. Using the left_join function, we then add a new column to our crime_data_with_areas data frame. Finally, we create data frames for violent and property crime, following the same methodology as for felonies and misdemeanors.

crimecode_data <- read.csv(file = here::here("data/Balt_CRIME_CODES.csv"))

unique(crime_data_with_areas$CrimeCode) %in% unique(crimecode_data$CODE) #We identify mismatches: three codes carry a trailing blank space

crimecode_data$CODE[185] <- "8H"
crimecode_data$CODE[186] <- "8I"
crimecode_data$CODE[187] <- "8J"

crime_data_with_areas <- crime_data_with_areas %>% 
  left_join(crimecode_data[,c(1,8)],by=c("CrimeCode"="CODE"))

unique(crime_data_with_areas$VIO_PROP_CFS)
which(is.na(crime_data_with_areas$VIO_PROP_CFS)) #We ensure that we have no NAs

CrimePerCategory2PerArea <- crime_data_with_areas %>% 
  group_by(Community,VIO_PROP_CFS) %>%
  summarize(RepartitionPerCategory2PerArea=n())

sum(CrimePerCategory2PerArea$RepartitionPerCategory2PerArea) #Again, we check that we indeed have 349482 observations

CrimeCategory2Repartition <- CrimePerCategory2PerArea %>% 
  group_by(VIO_PROP_CFS) %>% 
  summarise(Repartition=sum(RepartitionPerCategory2PerArea))

PropertyStats <-  CrimePerCategory2PerArea %>% filter(VIO_PROP_CFS=="PROPERTY") 

PropertyStats$PropertyCrimePerCapitaPerArea <-((CrimePerCategory2PerArea%>% filter(VIO_PROP_CFS=="PROPERTY"))[[3]]/population_data$tpop20)*1000

PropertyStats[56,] <- list("Unassigned -- Jail","PROPERTY",0,0)

ViolentStats <-  CrimePerCategory2PerArea %>% filter(VIO_PROP_CFS=="VIOLENT") 

ViolentStats$ViolentCrimePerCapitaPerArea <-((CrimePerCategory2PerArea%>% filter(VIO_PROP_CFS=="VIOLENT"))[[3]]/population_data$tpop20)*1000

ViolentStats[56,] <- list("Unassigned -- Jail","VIOLENT",0,0)

Community_data <- Community_data %>% 
  left_join(ViolentStats[,c(1,4)],by="Community") %>% 
  left_join(PropertyStats[,c(1,4)],by="Community")
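
The three codes above are corrected by row index. A minimal sketch of an alternative that does not depend on row positions is to strip whitespace from the whole column with trimws before joining. The toy table below is made up, including the category assigned to each code.

```r
# Toy lookup table with trailing blanks on two codes (made-up values).
codes <- data.frame(
  CODE         = c("8H ", "8I ", "4E"),
  VIO_PROP_CFS = c("PROPERTY", "PROPERTY", "VIOLENT"),
  stringsAsFactors = FALSE
)
recorded <- c("8H", "8I", "4E")   # codes as they appear in the crime records

before <- recorded %in% codes$CODE   # two codes fail to match
codes$CODE <- trimws(codes$CODE)     # strip leading and trailing whitespace
after <- recorded %in% codes$CODE    # all codes match

before
after
```

This would also catch any additional padded codes if the lookup file were updated later.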

3.3.1 Mapping of felonies and Misdemeanors

After ensuring that we have a perfect match, we perform a left join for felonies and for misdemeanors and map both.

#Felony

baltimore$community %in% FelonyStats$Community

baltimore@data <- left_join(baltimore@data, FelonyStats, by = c('community' = 'Community'))

Felony_map <- tm_shape(baltimore) + tm_fill(col = "FelonyPerCapitaPerArea", title ="Felonies per 1000 inhabitants per Area",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)

Felony_map

#Misdemeanor

baltimore$community %in% MisdemeanorStats$Community

baltimore@data <- left_join(baltimore@data, MisdemeanorStats, by = c('community' = 'Community'))

Misdemeanor_map <- tm_shape(baltimore) + tm_fill(col = "MisdemeanorPerCapitaPerArea", title ="Misdemeanors per 1000 inhabitants per Area",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)

Misdemeanor_map

3.4 Calculation of crime evolution

Here we want to see how crime evolved over time. We could have used a loop, but we have not yet found a clean way to write one, so we create one data set per year. The results are interesting. Comparing the number of observations in each yearly data set, we see roughly 40,000 cases a year, except for 2020 (which is due to COVID) and 2021 (which was not yet complete). We do not create data sets for 2013 and earlier, because very few observations predate 2014. The graph shows the monthly evolution of crime over the whole period. There seems to be a recurring pattern: each year, crime increases mid-year before decreasing in December.

Crime_in_2021 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2021-01-01") & CrimeDateTime <= as.Date("2021-12-31"))

Crime_in_2020 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2020-01-01") & CrimeDateTime <= as.Date("2020-12-31"))

Crime_in_2019 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2019-01-01") & CrimeDateTime <= as.Date("2019-12-31"))

Crime_in_2018 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2018-01-01") & CrimeDateTime <= as.Date("2018-12-31"))

Crime_in_2017 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2017-01-01") & CrimeDateTime <= as.Date("2017-12-31"))

Crime_in_2016 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2016-01-01") & CrimeDateTime <= as.Date("2016-12-31"))

Crime_in_2015 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2015-01-01") & CrimeDateTime <= as.Date("2015-12-31"))

Crime_in_2014 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2014-01-01") & CrimeDateTime <= as.Date("2014-12-31"))

crime_data_with_areas %>%  filter(CrimeDateTime < as.Date("2014-01-01")) #We see that we have very few (76) observations before 2014, thus we do not consider them

Crime_Monthly_evolution_map <- crime_data_with_areas %>% 
  count(month=floor_date(CrimeDateTime,"month")) %>% 
  ggplot(aes(month,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2021-08-31"))) #This enables us to see how crime evolves, month after month

Crime_Monthly_evolution_map
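
The yearly comparison does not strictly require a loop or one data set per year. As a sketch on made-up dates, the year can be extracted from the date column and counted directly. The crimes tibble is hypothetical, and we assume, as in the filters above, that CrimeDateTime is stored as a Date.

```r
library(dplyr)
library(lubridate)

# Toy crime records with made-up dates.
crimes <- tibble(CrimeDateTime = as.Date(c(
  "2019-03-01", "2019-07-15",
  "2020-01-20", "2020-06-30", "2020-12-31",
  "2021-02-02"
)))

crimes_per_year <- crimes %>%
  mutate(Year = year(CrimeDateTime)) %>%   # extract the calendar year
  count(Year, name = "Observations")       # one row per year

crimes_per_year
```

The resulting table gives the number of observations per year in a single step.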

Next, we calculate the crime per capita for each year: using the piping operator, we group by community and summarize the counts, then convert them to rates per 1000 inhabitants. Finally, we create the crime_evolution data set, which combines all the yearly data.

#_____ Calculations of the crime rates

CrimePerCapitaPerArea2021 <- Crime_in_2021 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency21=n())

CrimePerCapitaPerArea2021 <-  mutate(CrimePerCapitaPerArea2021,CrimePer1000inhabitants21=((CrimePerCapitaPerArea2021$CrimeFrequency21/population_data$tpop20)*1000))

CrimePerCapitaPerArea2021 <- rbind(CrimePerCapitaPerArea2021,list("Unassigned -- Jail",0,0))

CrimePerCapitaPerArea2020 <- Crime_in_2020 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency20=n())

CrimePerCapitaPerArea2020 <-  mutate(CrimePerCapitaPerArea2020,CrimePer1000inhabitants20=((CrimePerCapitaPerArea2020$CrimeFrequency20/population_data$tpop20)*1000))

CrimePerCapitaPerArea2020 <- rbind(CrimePerCapitaPerArea2020,list("Unassigned -- Jail",0,0))

CrimePerCapitaPerArea2019 <- Crime_in_2019 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency19=n())

CrimePerCapitaPerArea2019 <-  mutate(CrimePerCapitaPerArea2019,CrimePer1000inhabitants19=((CrimePerCapitaPerArea2019$CrimeFrequency19/population_data$tpop20)*1000))

CrimePerCapitaPerArea2019 <- rbind(CrimePerCapitaPerArea2019,list("Unassigned -- Jail",0,0))

CrimePerCapitaPerArea2018 <- Crime_in_2018 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency18=n())

CrimePerCapitaPerArea2018 <-  mutate(CrimePerCapitaPerArea2018,CrimePer1000inhabitants18=((CrimePerCapitaPerArea2018$CrimeFrequency18/population_data$tpop20)*1000))

CrimePerCapitaPerArea2018 <- rbind(CrimePerCapitaPerArea2018,list("Unassigned -- Jail",0,0))

CrimePerCapitaPerArea2017 <- Crime_in_2017 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency17=n())

CrimePerCapitaPerArea2017 <-  mutate(CrimePerCapitaPerArea2017,CrimePer1000inhabitants17=((CrimePerCapitaPerArea2017$CrimeFrequency17/population_data$tpop20)*1000))

CrimePerCapitaPerArea2017 <- rbind(CrimePerCapitaPerArea2017,list("Unassigned -- Jail",0,0))

CrimePerCapitaPerArea2016 <- Crime_in_2016 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency16=n())

CrimePerCapitaPerArea2016 <-  mutate(CrimePerCapitaPerArea2016,CrimePer1000inhabitants16=((CrimePerCapitaPerArea2016$CrimeFrequency16/population_data$tpop20)*1000))

CrimePerCapitaPerArea2016 <- rbind(CrimePerCapitaPerArea2016,list("Unassigned -- Jail",0,0))

CrimePerCapitaPerArea2015 <- Crime_in_2015 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency15=n())

CrimePerCapitaPerArea2015 <-  mutate(CrimePerCapitaPerArea2015,CrimePer1000inhabitants15=((CrimePerCapitaPerArea2015$CrimeFrequency15/population_data$tpop20)*1000))

CrimePerCapitaPerArea2015 <- rbind(CrimePerCapitaPerArea2015,list("Unassigned -- Jail",0,0))

CrimePerCapitaPerArea2014 <- Crime_in_2014 %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency14=n())

CrimePerCapitaPerArea2014 <-  mutate(CrimePerCapitaPerArea2014,CrimePer1000inhabitants14=((CrimePerCapitaPerArea2014$CrimeFrequency14/population_data$tpop20)*1000))

CrimePerCapitaPerArea2014 <- rbind(CrimePerCapitaPerArea2014,list("Unassigned -- Jail",0,0))

crime_evolution <- CrimePerCapitaPerArea2021 %>% 
  left_join(CrimePerCapitaPerArea2020,by="Community") %>% 
  left_join(CrimePerCapitaPerArea2019,by="Community") %>%
  left_join(CrimePerCapitaPerArea2018,by="Community") %>%
  left_join(CrimePerCapitaPerArea2017,by="Community") %>% 
  left_join(CrimePerCapitaPerArea2016,by="Community") %>% 
  left_join(CrimePerCapitaPerArea2015,by="Community") %>% 
  left_join(CrimePerCapitaPerArea2014,by="Community")

Community_data <- Community_data %>% 
  left_join(crime_evolution,by="Community")
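
The eight yearly blocks above all follow the same pattern, so they could in principle be collapsed into a single pipeline that groups by community and year and then spreads the years into columns. The sketch below illustrates this on made-up data. The toy tables and the Community key in the population table are assumptions, and, unlike crime_evolution, the result keeps only the rates and not the frequencies.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Toy data: all names and numbers are made up for illustration.
crimes <- tibble(
  Community     = c("A", "A", "A", "B", "B"),
  CrimeDateTime = as.Date(c("2019-05-01", "2020-05-01", "2020-09-01",
                            "2019-02-01", "2020-03-01"))
)
population <- tibble(Community = c("A", "B"), tpop20 = c(1000, 500))

crime_evolution_sketch <- crimes %>%
  mutate(Year = year(CrimeDateTime)) %>%
  count(Community, Year, name = "CrimeFrequency") %>%   # crimes per area and year
  left_join(population, by = "Community") %>%
  mutate(CrimePer1000 = CrimeFrequency / tpop20 * 1000) %>%
  select(Community, Year, CrimePer1000) %>%
  pivot_wider(names_from = Year, values_from = CrimePer1000,
              names_prefix = "CrimePer1000inhabitants", values_fill = 0)

crime_evolution_sketch
```

Stopping before pivot_wider leaves the data in long format (one row per community and year), which is close to the layout needed for an animated map.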

Another interesting way to visualise how crime evolved is an animated map, which we can create with the tmap_animation function. To use it, however, we first have to build a rather particular tibble. We want the animated map to display the evolution of crime per capita over 7 years (2014 to 2020; we drop 2021 because the year is not complete). We therefore need 7 x 56 = 392 observations: one crime per capita value per year for each of the 56 areas. The tibble is peculiar in that each observation must also carry, in a separate column, the polygon (an S4 object) of the corresponding area. rep cannot replicate a single S4 object directly, so we wrote the list out manually.

Once the tibble is built, we want to merge its data into a SpatialPolygonsDataFrame, namely baltimore. However, as the tibble contains 392 observations, this would enlarge our SpatialPolygonsDataFrame. Since the baltimore object is also used for other purposes, we work on a copy, baltimore_alias, and merge the new tibble into it using left_join. We then create the bbox object, which delimits the geographical area of interest, and the pb object, which defines custom class breaks. Finally, we create a map using the tm_shape function and animate it using tmap_animation.

anim_tibble <-  tibble(Year=rep(2020:2014,56),Community=rep(Community_data$Community,each=7),CrimeRate=as.vector(t(crime_evolution[,-c(1,2,3,4,6,8,10,12,14,16)])),geometry=list(
  baltimore@polygons[[1]],baltimore@polygons[[1]],baltimore@polygons[[1]],baltimore@polygons[[1]],baltimore@polygons[[1]],baltimore@polygons[[1]],baltimore@polygons[[1]],
  baltimore@polygons[[2]],baltimore@polygons[[2]],baltimore@polygons[[2]],baltimore@polygons[[2]],baltimore@polygons[[2]],baltimore@polygons[[2]],baltimore@polygons[[2]],
  baltimore@polygons[[3]],baltimore@polygons[[3]],baltimore@polygons[[3]],baltimore@polygons[[3]],baltimore@polygons[[3]],baltimore@polygons[[3]],baltimore@polygons[[3]],
  baltimore@polygons[[4]],baltimore@polygons[[4]],baltimore@polygons[[4]],baltimore@polygons[[4]],baltimore@polygons[[4]],baltimore@polygons[[4]],baltimore@polygons[[4]],
  baltimore@polygons[[5]],baltimore@polygons[[5]],baltimore@polygons[[5]],baltimore@polygons[[5]],baltimore@polygons[[5]],baltimore@polygons[[5]],baltimore@polygons[[5]],
  baltimore@polygons[[6]],baltimore@polygons[[6]],baltimore@polygons[[6]],baltimore@polygons[[6]],baltimore@polygons[[6]],baltimore@polygons[[6]],baltimore@polygons[[6]],
  baltimore@polygons[[7]],baltimore@polygons[[7]],baltimore@polygons[[7]],baltimore@polygons[[7]],baltimore@polygons[[7]],baltimore@polygons[[7]],baltimore@polygons[[7]],
  baltimore@polygons[[8]],baltimore@polygons[[8]],baltimore@polygons[[8]],baltimore@polygons[[8]],baltimore@polygons[[8]],baltimore@polygons[[8]],baltimore@polygons[[8]],
  baltimore@polygons[[9]],baltimore@polygons[[9]],baltimore@polygons[[9]],baltimore@polygons[[9]],baltimore@polygons[[9]],baltimore@polygons[[9]],baltimore@polygons[[9]],
  baltimore@polygons[[10]],baltimore@polygons[[10]],baltimore@polygons[[10]],baltimore@polygons[[10]],baltimore@polygons[[10]],baltimore@polygons[[10]],baltimore@polygons[[10]],
  baltimore@polygons[[11]],baltimore@polygons[[11]],baltimore@polygons[[11]],baltimore@polygons[[11]],baltimore@polygons[[11]],baltimore@polygons[[11]],baltimore@polygons[[11]],
  baltimore@polygons[[12]],baltimore@polygons[[12]],baltimore@polygons[[12]],baltimore@polygons[[12]],baltimore@polygons[[12]],baltimore@polygons[[12]],baltimore@polygons[[12]],
  baltimore@polygons[[13]],baltimore@polygons[[13]],baltimore@polygons[[13]],baltimore@polygons[[13]],baltimore@polygons[[13]],baltimore@polygons[[13]],baltimore@polygons[[13]],
  baltimore@polygons[[14]],baltimore@polygons[[14]],baltimore@polygons[[14]],baltimore@polygons[[14]],baltimore@polygons[[14]],baltimore@polygons[[14]],baltimore@polygons[[14]],
  baltimore@polygons[[15]],baltimore@polygons[[15]],baltimore@polygons[[15]],baltimore@polygons[[15]],baltimore@polygons[[15]],baltimore@polygons[[15]],baltimore@polygons[[15]],
  baltimore@polygons[[16]],baltimore@polygons[[16]],baltimore@polygons[[16]],baltimore@polygons[[16]],baltimore@polygons[[16]],baltimore@polygons[[16]],baltimore@polygons[[16]],
  baltimore@polygons[[17]],baltimore@polygons[[17]],baltimore@polygons[[17]],baltimore@polygons[[17]],baltimore@polygons[[17]],baltimore@polygons[[17]],baltimore@polygons[[17]],
  baltimore@polygons[[18]],baltimore@polygons[[18]],baltimore@polygons[[18]],baltimore@polygons[[18]],baltimore@polygons[[18]],baltimore@polygons[[18]],baltimore@polygons[[18]],
  baltimore@polygons[[19]],baltimore@polygons[[19]],baltimore@polygons[[19]],baltimore@polygons[[19]],baltimore@polygons[[19]],baltimore@polygons[[19]],baltimore@polygons[[19]],
  baltimore@polygons[[20]],baltimore@polygons[[20]],baltimore@polygons[[20]],baltimore@polygons[[20]],baltimore@polygons[[20]],baltimore@polygons[[20]],baltimore@polygons[[20]],
  baltimore@polygons[[21]],baltimore@polygons[[21]],baltimore@polygons[[21]],baltimore@polygons[[21]],baltimore@polygons[[21]],baltimore@polygons[[21]],baltimore@polygons[[21]],
  baltimore@polygons[[22]],baltimore@polygons[[22]],baltimore@polygons[[22]],baltimore@polygons[[22]],baltimore@polygons[[22]],baltimore@polygons[[22]],baltimore@polygons[[22]],
  baltimore@polygons[[23]],baltimore@polygons[[23]],baltimore@polygons[[23]],baltimore@polygons[[23]],baltimore@polygons[[23]],baltimore@polygons[[23]],baltimore@polygons[[23]],
  baltimore@polygons[[24]],baltimore@polygons[[24]],baltimore@polygons[[24]],baltimore@polygons[[24]],baltimore@polygons[[24]],baltimore@polygons[[24]],baltimore@polygons[[24]],
  baltimore@polygons[[25]],baltimore@polygons[[25]],baltimore@polygons[[25]],baltimore@polygons[[25]],baltimore@polygons[[25]],baltimore@polygons[[25]],baltimore@polygons[[25]],
  baltimore@polygons[[26]],baltimore@polygons[[26]],baltimore@polygons[[26]],baltimore@polygons[[26]],baltimore@polygons[[26]],baltimore@polygons[[26]],baltimore@polygons[[26]],
  baltimore@polygons[[27]],baltimore@polygons[[27]],baltimore@polygons[[27]],baltimore@polygons[[27]],baltimore@polygons[[27]],baltimore@polygons[[27]],baltimore@polygons[[27]],
  baltimore@polygons[[28]],baltimore@polygons[[28]],baltimore@polygons[[28]],baltimore@polygons[[28]],baltimore@polygons[[28]],baltimore@polygons[[28]],baltimore@polygons[[28]],
  baltimore@polygons[[29]],baltimore@polygons[[29]],baltimore@polygons[[29]],baltimore@polygons[[29]],baltimore@polygons[[29]],baltimore@polygons[[29]],baltimore@polygons[[29]],
  baltimore@polygons[[30]],baltimore@polygons[[30]],baltimore@polygons[[30]],baltimore@polygons[[30]],baltimore@polygons[[30]],baltimore@polygons[[30]],baltimore@polygons[[30]],
  baltimore@polygons[[31]],baltimore@polygons[[31]],baltimore@polygons[[31]],baltimore@polygons[[31]],baltimore@polygons[[31]],baltimore@polygons[[31]],baltimore@polygons[[31]],
  baltimore@polygons[[32]],baltimore@polygons[[32]],baltimore@polygons[[32]],baltimore@polygons[[32]],baltimore@polygons[[32]],baltimore@polygons[[32]],baltimore@polygons[[32]],
  baltimore@polygons[[33]],baltimore@polygons[[33]],baltimore@polygons[[33]],baltimore@polygons[[33]],baltimore@polygons[[33]],baltimore@polygons[[33]],baltimore@polygons[[33]],
  baltimore@polygons[[34]],baltimore@polygons[[34]],baltimore@polygons[[34]],baltimore@polygons[[34]],baltimore@polygons[[34]],baltimore@polygons[[34]],baltimore@polygons[[34]],
  baltimore@polygons[[35]],baltimore@polygons[[35]],baltimore@polygons[[35]],baltimore@polygons[[35]],baltimore@polygons[[35]],baltimore@polygons[[35]],baltimore@polygons[[35]],
  baltimore@polygons[[36]],baltimore@polygons[[36]],baltimore@polygons[[36]],baltimore@polygons[[36]],baltimore@polygons[[36]],baltimore@polygons[[36]],baltimore@polygons[[36]],
  baltimore@polygons[[37]],baltimore@polygons[[37]],baltimore@polygons[[37]],baltimore@polygons[[37]],baltimore@polygons[[37]],baltimore@polygons[[37]],baltimore@polygons[[37]],
  baltimore@polygons[[38]],baltimore@polygons[[38]],baltimore@polygons[[38]],baltimore@polygons[[38]],baltimore@polygons[[38]],baltimore@polygons[[38]],baltimore@polygons[[38]],
  baltimore@polygons[[39]],baltimore@polygons[[39]],baltimore@polygons[[39]],baltimore@polygons[[39]],baltimore@polygons[[39]],baltimore@polygons[[39]],baltimore@polygons[[39]],
  baltimore@polygons[[40]],baltimore@polygons[[40]],baltimore@polygons[[40]],baltimore@polygons[[40]],baltimore@polygons[[40]],baltimore@polygons[[40]],baltimore@polygons[[40]],
  baltimore@polygons[[41]],baltimore@polygons[[41]],baltimore@polygons[[41]],baltimore@polygons[[41]],baltimore@polygons[[41]],baltimore@polygons[[41]],baltimore@polygons[[41]],
  baltimore@polygons[[42]],baltimore@polygons[[42]],baltimore@polygons[[42]],baltimore@polygons[[42]],baltimore@polygons[[42]],baltimore@polygons[[42]],baltimore@polygons[[42]],
  baltimore@polygons[[43]],baltimore@polygons[[43]],baltimore@polygons[[43]],baltimore@polygons[[43]],baltimore@polygons[[43]],baltimore@polygons[[43]],baltimore@polygons[[43]],
  baltimore@polygons[[44]],baltimore@polygons[[44]],baltimore@polygons[[44]],baltimore@polygons[[44]],baltimore@polygons[[44]],baltimore@polygons[[44]],baltimore@polygons[[44]],
  baltimore@polygons[[45]],baltimore@polygons[[45]],baltimore@polygons[[45]],baltimore@polygons[[45]],baltimore@polygons[[45]],baltimore@polygons[[45]],baltimore@polygons[[45]],
  baltimore@polygons[[46]],baltimore@polygons[[46]],baltimore@polygons[[46]],baltimore@polygons[[46]],baltimore@polygons[[46]],baltimore@polygons[[46]],baltimore@polygons[[46]],
  baltimore@polygons[[47]],baltimore@polygons[[47]],baltimore@polygons[[47]],baltimore@polygons[[47]],baltimore@polygons[[47]],baltimore@polygons[[47]],baltimore@polygons[[47]],
  baltimore@polygons[[48]],baltimore@polygons[[48]],baltimore@polygons[[48]],baltimore@polygons[[48]],baltimore@polygons[[48]],baltimore@polygons[[48]],baltimore@polygons[[48]],
  baltimore@polygons[[49]],baltimore@polygons[[49]],baltimore@polygons[[49]],baltimore@polygons[[49]],baltimore@polygons[[49]],baltimore@polygons[[49]],baltimore@polygons[[49]],
  baltimore@polygons[[50]],baltimore@polygons[[50]],baltimore@polygons[[50]],baltimore@polygons[[50]],baltimore@polygons[[50]],baltimore@polygons[[50]],baltimore@polygons[[50]],
  baltimore@polygons[[51]],baltimore@polygons[[51]],baltimore@polygons[[51]],baltimore@polygons[[51]],baltimore@polygons[[51]],baltimore@polygons[[51]],baltimore@polygons[[51]],
  baltimore@polygons[[52]],baltimore@polygons[[52]],baltimore@polygons[[52]],baltimore@polygons[[52]],baltimore@polygons[[52]],baltimore@polygons[[52]],baltimore@polygons[[52]],
  baltimore@polygons[[53]],baltimore@polygons[[53]],baltimore@polygons[[53]],baltimore@polygons[[53]],baltimore@polygons[[53]],baltimore@polygons[[53]],baltimore@polygons[[53]],
  baltimore@polygons[[54]],baltimore@polygons[[54]],baltimore@polygons[[54]],baltimore@polygons[[54]],baltimore@polygons[[54]],baltimore@polygons[[54]],baltimore@polygons[[54]],
  baltimore@polygons[[55]],baltimore@polygons[[55]],baltimore@polygons[[55]],baltimore@polygons[[55]],baltimore@polygons[[55]],baltimore@polygons[[55]],baltimore@polygons[[55]],
  baltimore@polygons[[56]],baltimore@polygons[[56]],baltimore@polygons[[56]],baltimore@polygons[[56]],baltimore@polygons[[56]],baltimore@polygons[[56]],baltimore@polygons[[56]]))

baltimore_alias <- baltimore

baltimore_alias@polygons <- anim_tibble$geometry

baltimore_alias@data$community %in% anim_tibble$Community #Again, we ensure that we have a perfect match

baltimore_alias@data <-left_join(baltimore_alias@data,anim_tibble,by = c('community' = 'Community'))

bbox <- baltimore@bbox
pb <-  c(0,25,50,75,100,125,150,175,200,225,250)

animated_crime_map <- tm_shape(baltimore_alias,bbox = bbox, projection = crs.geo1) +
  tm_polygons("CrimeRate",breaks=pb) +
  tm_facets(free.scales.fill = F,along = "Year")+tm_shape(baltimore)+tm_borders()

tmap_animation(animated_crime_map, delay=100)

#To save the animation as a GIF, a filename can be passed to tmap_animation
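
The list of polygons in anim_tibble was written out by hand. A shorter route may be available: rep fails on a single S4 object, but it does work on an ordinary list that contains S4 objects. The sketch below demonstrates this with a made-up S4 class (ToyPolygon is hypothetical). Since the polygons slot of a SpatialPolygonsDataFrame is an ordinary list, rep(baltimore@polygons, each = 7) should produce the same 392-element list, although we have not run this on the real data.

```r
library(methods)

# A made-up S4 class standing in for the polygon objects.
setClass("ToyPolygon", representation(id = "integer"))

# An ordinary list of three S4 objects, analogous to a polygons slot.
toy_polygons <- lapply(1:3, function(i) new("ToyPolygon", id = i))

# Repeat every element twice, keeping copies of the same polygon adjacent.
n_years <- 2L
replicated <- rep(toy_polygons, each = n_years)

# Inspect the order of the replicated objects.
ids <- vapply(replicated, function(p) p@id, integer(1))
ids
```

With each = 7, every polygon would appear seven times in a row, matching the order of Community = rep(..., each = 7) in the tibble.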

3.4.1 Calculation of violent crime and property crime evolution

We can make the exact same computation to calculate violent crime and property crime evolution.

Violent_Crime_in_2021 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2021-01-01") & CrimeDateTime <= as.Date("2021-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

Violent_Crime_in_2020 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2020-01-01") & CrimeDateTime <= as.Date("2020-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

Violent_Crime_in_2019 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2019-01-01") & CrimeDateTime <= as.Date("2019-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

Violent_Crime_in_2018 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2018-01-01") & CrimeDateTime <= as.Date("2018-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

Violent_Crime_in_2017 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2017-01-01") & CrimeDateTime <= as.Date("2017-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

Violent_Crime_in_2016 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2016-01-01") & CrimeDateTime <= as.Date("2016-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

Violent_Crime_in_2015 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2015-01-01") & CrimeDateTime <= as.Date("2015-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

Violent_Crime_in_2014 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2014-01-01") & CrimeDateTime <= as.Date("2014-12-31")) %>% filter(VIO_PROP_CFS=="VIOLENT")

ViolentCrimePerCapitaPerArea2021 <- Violent_Crime_in_2021 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency21=n())

ViolentCrimePerCapitaPerArea2021 <-  mutate(ViolentCrimePerCapitaPerArea2021,ViolentCrimePer1000inhabitants21=((ViolentCrimePerCapitaPerArea2021$ViolentCrimeFrequency21/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2021 <- rbind(ViolentCrimePerCapitaPerArea2021,list("Unassigned -- Jail",0,0))

ViolentCrimePerCapitaPerArea2020 <- Violent_Crime_in_2020 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency20=n())

ViolentCrimePerCapitaPerArea2020 <-  mutate(ViolentCrimePerCapitaPerArea2020,ViolentCrimePer1000inhabitants20=((ViolentCrimePerCapitaPerArea2020$ViolentCrimeFrequency20/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2020 <- rbind(ViolentCrimePerCapitaPerArea2020,list("Unassigned -- Jail",0,0))

ViolentCrimePerCapitaPerArea2019 <- Violent_Crime_in_2019 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency19=n())

ViolentCrimePerCapitaPerArea2019 <-  mutate(ViolentCrimePerCapitaPerArea2019,ViolentCrimePer1000inhabitants19=((ViolentCrimePerCapitaPerArea2019$ViolentCrimeFrequency19/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2019 <- rbind(ViolentCrimePerCapitaPerArea2019,list("Unassigned -- Jail",0,0))

ViolentCrimePerCapitaPerArea2018 <- Violent_Crime_in_2018 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency18=n())

ViolentCrimePerCapitaPerArea2018 <-  mutate(ViolentCrimePerCapitaPerArea2018,ViolentCrimePer1000inhabitants18=((ViolentCrimePerCapitaPerArea2018$ViolentCrimeFrequency18/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2018 <- rbind(ViolentCrimePerCapitaPerArea2018,list("Unassigned -- Jail",0,0))

ViolentCrimePerCapitaPerArea2017 <- Violent_Crime_in_2017 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency17=n())

ViolentCrimePerCapitaPerArea2017 <-  mutate(ViolentCrimePerCapitaPerArea2017,ViolentCrimePer1000inhabitants17=((ViolentCrimePerCapitaPerArea2017$ViolentCrimeFrequency17/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2017 <- rbind(ViolentCrimePerCapitaPerArea2017,list("Unassigned -- Jail",0,0))

ViolentCrimePerCapitaPerArea2016 <- Violent_Crime_in_2016 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency16=n())

ViolentCrimePerCapitaPerArea2016 <-  mutate(ViolentCrimePerCapitaPerArea2016,ViolentCrimePer1000inhabitants16=((ViolentCrimePerCapitaPerArea2016$ViolentCrimeFrequency16/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2016 <- rbind(ViolentCrimePerCapitaPerArea2016,list("Unassigned -- Jail",0,0))

ViolentCrimePerCapitaPerArea2015 <- Violent_Crime_in_2015 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency15=n())

ViolentCrimePerCapitaPerArea2015 <-  mutate(ViolentCrimePerCapitaPerArea2015,ViolentCrimePer1000inhabitants15=((ViolentCrimePerCapitaPerArea2015$ViolentCrimeFrequency15/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2015 <- rbind(ViolentCrimePerCapitaPerArea2015,list("Unassigned -- Jail",0,0))

ViolentCrimePerCapitaPerArea2014 <- Violent_Crime_in_2014 %>% 
  group_by(Community) %>%
  summarize(ViolentCrimeFrequency14=n())

ViolentCrimePerCapitaPerArea2014 <-  mutate(ViolentCrimePerCapitaPerArea2014,ViolentCrimePer1000inhabitants14=((ViolentCrimePerCapitaPerArea2014$ViolentCrimeFrequency14/population_data$tpop20)*1000))

ViolentCrimePerCapitaPerArea2014 <- rbind(ViolentCrimePerCapitaPerArea2014,list("Unassigned -- Jail",0,0))

Violent_crime_evolution <- ViolentCrimePerCapitaPerArea2021 %>% 
  left_join(ViolentCrimePerCapitaPerArea2020,by="Community") %>% 
  left_join(ViolentCrimePerCapitaPerArea2019,by="Community") %>%
  left_join(ViolentCrimePerCapitaPerArea2018,by="Community") %>%
  left_join(ViolentCrimePerCapitaPerArea2017,by="Community") %>% 
  left_join(ViolentCrimePerCapitaPerArea2016,by="Community") %>% 
  left_join(ViolentCrimePerCapitaPerArea2015,by="Community") %>% 
  left_join(ViolentCrimePerCapitaPerArea2014,by="Community")

Community_data <- Community_data %>% 
  left_join(Violent_crime_evolution,by="Community")

Violent_Crime_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(VIO_PROP_CFS=="VIOLENT") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "Overall, violent crime seems to have decreased for the 2017 to 2020 period",x="Year",y="Violent crime occurrences")

Violent_Crime_Yearly_evolution_map

#______________

Property_Crime_in_2021 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2021-01-01") & CrimeDateTime <= as.Date("2021-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

Property_Crime_in_2020 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2020-01-01") & CrimeDateTime <= as.Date("2020-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

Property_Crime_in_2019 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2019-01-01") & CrimeDateTime <= as.Date("2019-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

Property_Crime_in_2018 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2018-01-01") & CrimeDateTime <= as.Date("2018-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

Property_Crime_in_2017 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2017-01-01") & CrimeDateTime <= as.Date("2017-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

Property_Crime_in_2016 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2016-01-01") & CrimeDateTime <= as.Date("2016-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

Property_Crime_in_2015 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2015-01-01") & CrimeDateTime <= as.Date("2015-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

Property_Crime_in_2014 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2014-01-01") & CrimeDateTime <= as.Date("2014-12-31")) %>% filter(VIO_PROP_CFS=="PROPERTY")

PropertyCrimePerCapitaPerArea2021 <- Property_Crime_in_2021 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency21=n())

PropertyCrimePerCapitaPerArea2021 <-  mutate(PropertyCrimePerCapitaPerArea2021,PropertyCrimePer1000inhabitants21=((PropertyCrimePerCapitaPerArea2021$PropertyCrimeFrequency21/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2021 <- rbind(PropertyCrimePerCapitaPerArea2021,list("Unassigned -- Jail",0,0))

PropertyCrimePerCapitaPerArea2020 <- Property_Crime_in_2020 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency20=n())

PropertyCrimePerCapitaPerArea2020 <-  mutate(PropertyCrimePerCapitaPerArea2020,PropertyCrimePer1000inhabitants20=((PropertyCrimePerCapitaPerArea2020$PropertyCrimeFrequency20/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2020 <- rbind(PropertyCrimePerCapitaPerArea2020,list("Unassigned -- Jail",0,0))

PropertyCrimePerCapitaPerArea2019 <- Property_Crime_in_2019 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency19=n())

PropertyCrimePerCapitaPerArea2019 <-  mutate(PropertyCrimePerCapitaPerArea2019,PropertyCrimePer1000inhabitants19=((PropertyCrimePerCapitaPerArea2019$PropertyCrimeFrequency19/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2019 <- rbind(PropertyCrimePerCapitaPerArea2019,list("Unassigned -- Jail",0,0))

PropertyCrimePerCapitaPerArea2018 <- Property_Crime_in_2018 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency18=n())

PropertyCrimePerCapitaPerArea2018 <-  mutate(PropertyCrimePerCapitaPerArea2018,PropertyCrimePer1000inhabitants18=((PropertyCrimePerCapitaPerArea2018$PropertyCrimeFrequency18/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2018 <- rbind(PropertyCrimePerCapitaPerArea2018,list("Unassigned -- Jail",0,0))

PropertyCrimePerCapitaPerArea2017 <- Property_Crime_in_2017 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency17=n())

PropertyCrimePerCapitaPerArea2017 <-  mutate(PropertyCrimePerCapitaPerArea2017,PropertyCrimePer1000inhabitants17=((PropertyCrimePerCapitaPerArea2017$PropertyCrimeFrequency17/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2017 <- rbind(PropertyCrimePerCapitaPerArea2017,list("Unassigned -- Jail",0,0))

PropertyCrimePerCapitaPerArea2016 <- Property_Crime_in_2016 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency16=n())

PropertyCrimePerCapitaPerArea2016 <-  mutate(PropertyCrimePerCapitaPerArea2016,PropertyCrimePer1000inhabitants16=((PropertyCrimePerCapitaPerArea2016$PropertyCrimeFrequency16/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2016 <- rbind(PropertyCrimePerCapitaPerArea2016,list("Unassigned -- Jail",0,0))

PropertyCrimePerCapitaPerArea2015 <- Property_Crime_in_2015 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency15=n())

PropertyCrimePerCapitaPerArea2015 <-  mutate(PropertyCrimePerCapitaPerArea2015,PropertyCrimePer1000inhabitants15=((PropertyCrimePerCapitaPerArea2015$PropertyCrimeFrequency15/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2015 <- rbind(PropertyCrimePerCapitaPerArea2015,list("Unassigned -- Jail",0,0))

PropertyCrimePerCapitaPerArea2014 <- Property_Crime_in_2014 %>% 
  group_by(Community) %>%
  summarize(PropertyCrimeFrequency14=n())

PropertyCrimePerCapitaPerArea2014 <-  mutate(PropertyCrimePerCapitaPerArea2014,PropertyCrimePer1000inhabitants14=((PropertyCrimePerCapitaPerArea2014$PropertyCrimeFrequency14/population_data$tpop20)*1000))

PropertyCrimePerCapitaPerArea2014 <- rbind(PropertyCrimePerCapitaPerArea2014,list("Unassigned -- Jail",0,0))

Property_crime_evolution <- PropertyCrimePerCapitaPerArea2021 %>% 
  left_join(PropertyCrimePerCapitaPerArea2020,by="Community") %>% 
  left_join(PropertyCrimePerCapitaPerArea2019,by="Community") %>%
  left_join(PropertyCrimePerCapitaPerArea2018,by="Community") %>%
  left_join(PropertyCrimePerCapitaPerArea2017,by="Community") %>% 
  left_join(PropertyCrimePerCapitaPerArea2016,by="Community") %>% 
  left_join(PropertyCrimePerCapitaPerArea2015,by="Community") %>% 
  left_join(PropertyCrimePerCapitaPerArea2014,by="Community")

Community_data <- Community_data %>% 
  left_join(Property_crime_evolution,by="Community")

Property_Crime_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(VIO_PROP_CFS=="PROPERTY") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "Overall, property crime seems to have decreased for the 2017 to 2020 period",x="Year",y="Property crime occurrences")

Property_Crime_Yearly_evolution_map

3.4.2 Calculation of felony and misdemeanor evolution

We can apply exactly the same computation to felonies and misdemeanors.

Felony_in_2021 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2021-01-01") & CrimeDateTime <= as.Date("2021-12-31")) %>% filter(Category=="Felony")

Felony_in_2020 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2020-01-01") & CrimeDateTime <= as.Date("2020-12-31")) %>% filter(Category=="Felony")

Felony_in_2019 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2019-01-01") & CrimeDateTime <= as.Date("2019-12-31")) %>% filter(Category=="Felony")

Felony_in_2018 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2018-01-01") & CrimeDateTime <= as.Date("2018-12-31")) %>% filter(Category=="Felony")

Felony_in_2017 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2017-01-01") & CrimeDateTime <= as.Date("2017-12-31")) %>% filter(Category=="Felony")

Felony_in_2016 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2016-01-01") & CrimeDateTime <= as.Date("2016-12-31")) %>% filter(Category=="Felony")

Felony_in_2015 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2015-01-01") & CrimeDateTime <= as.Date("2015-12-31")) %>% filter(Category=="Felony")

Felony_in_2014 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2014-01-01") & CrimeDateTime <= as.Date("2014-12-31")) %>% filter(Category=="Felony")

FelonyPerCapitaPerArea2021 <- Felony_in_2021 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency21=n())

FelonyPerCapitaPerArea2021 <-  mutate(FelonyPerCapitaPerArea2021,FelonyPer1000inhabitants21=((FelonyPerCapitaPerArea2021$FelonyFrequency21/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2021 <- rbind(FelonyPerCapitaPerArea2021,list("Unassigned -- Jail",0,0))

FelonyPerCapitaPerArea2020 <- Felony_in_2020 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency20=n())

FelonyPerCapitaPerArea2020 <-  mutate(FelonyPerCapitaPerArea2020,FelonyPer1000inhabitants20=((FelonyPerCapitaPerArea2020$FelonyFrequency20/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2020 <- rbind(FelonyPerCapitaPerArea2020,list("Unassigned -- Jail",0,0))

FelonyPerCapitaPerArea2019 <- Felony_in_2019 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency19=n())

FelonyPerCapitaPerArea2019 <-  mutate(FelonyPerCapitaPerArea2019,FelonyPer1000inhabitants19=((FelonyPerCapitaPerArea2019$FelonyFrequency19/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2019 <- rbind(FelonyPerCapitaPerArea2019,list("Unassigned -- Jail",0,0))

FelonyPerCapitaPerArea2018 <- Felony_in_2018 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency18=n())

FelonyPerCapitaPerArea2018 <-  mutate(FelonyPerCapitaPerArea2018,FelonyPer1000inhabitants18=((FelonyPerCapitaPerArea2018$FelonyFrequency18/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2018 <- rbind(FelonyPerCapitaPerArea2018,list("Unassigned -- Jail",0,0))

FelonyPerCapitaPerArea2017 <- Felony_in_2017 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency17=n())

FelonyPerCapitaPerArea2017 <-  mutate(FelonyPerCapitaPerArea2017,FelonyPer1000inhabitants17=((FelonyPerCapitaPerArea2017$FelonyFrequency17/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2017 <- rbind(FelonyPerCapitaPerArea2017,list("Unassigned -- Jail",0,0))

FelonyPerCapitaPerArea2016 <- Felony_in_2016 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency16=n())

FelonyPerCapitaPerArea2016 <-  mutate(FelonyPerCapitaPerArea2016,FelonyPer1000inhabitants16=((FelonyPerCapitaPerArea2016$FelonyFrequency16/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2016 <- rbind(FelonyPerCapitaPerArea2016,list("Unassigned -- Jail",0,0))

FelonyPerCapitaPerArea2015 <- Felony_in_2015 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency15=n())

FelonyPerCapitaPerArea2015 <-  mutate(FelonyPerCapitaPerArea2015,FelonyPer1000inhabitants15=((FelonyPerCapitaPerArea2015$FelonyFrequency15/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2015 <- rbind(FelonyPerCapitaPerArea2015,list("Unassigned -- Jail",0,0))

FelonyPerCapitaPerArea2014 <- Felony_in_2014 %>% 
  group_by(Community) %>%
  summarize(FelonyFrequency14=n())

FelonyPerCapitaPerArea2014 <-  mutate(FelonyPerCapitaPerArea2014,FelonyPer1000inhabitants14=((FelonyPerCapitaPerArea2014$FelonyFrequency14/population_data$tpop20)*1000))

FelonyPerCapitaPerArea2014 <- rbind(FelonyPerCapitaPerArea2014,list("Unassigned -- Jail",0,0))

Felony_evolution <- FelonyPerCapitaPerArea2021 %>% 
  left_join(FelonyPerCapitaPerArea2020,by="Community") %>% 
  left_join(FelonyPerCapitaPerArea2019,by="Community") %>%
  left_join(FelonyPerCapitaPerArea2018,by="Community") %>%
  left_join(FelonyPerCapitaPerArea2017,by="Community") %>% 
  left_join(FelonyPerCapitaPerArea2016,by="Community") %>% 
  left_join(FelonyPerCapitaPerArea2015,by="Community") %>% 
  left_join(FelonyPerCapitaPerArea2014,by="Community")

Community_data <- Community_data %>% 
  left_join(Felony_evolution,by="Community")

Felony_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(Category=="Felony") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "In Baltimore, felonies started to decrease from 2017",x="Year",y="Felony occurrences")

Felony_Yearly_evolution_map

#___________

Misdemeanor_in_2021 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2021-01-01") & CrimeDateTime <= as.Date("2021-12-31")) %>% filter(Category=="Misdemeanor")

Misdemeanor_in_2020 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2020-01-01") & CrimeDateTime <= as.Date("2020-12-31")) %>% filter(Category=="Misdemeanor")

Misdemeanor_in_2019 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2019-01-01") & CrimeDateTime <= as.Date("2019-12-31")) %>% filter(Category=="Misdemeanor")

Misdemeanor_in_2018 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2018-01-01") & CrimeDateTime <= as.Date("2018-12-31")) %>% filter(Category=="Misdemeanor")

Misdemeanor_in_2017 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2017-01-01") & CrimeDateTime <= as.Date("2017-12-31")) %>% filter(Category=="Misdemeanor")

Misdemeanor_in_2016 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2016-01-01") & CrimeDateTime <= as.Date("2016-12-31")) %>% filter(Category=="Misdemeanor")

Misdemeanor_in_2015 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2015-01-01") & CrimeDateTime <= as.Date("2015-12-31")) %>% filter(Category=="Misdemeanor")

Misdemeanor_in_2014 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2014-01-01") & CrimeDateTime <= as.Date("2014-12-31")) %>% filter(Category=="Misdemeanor")

MisdemeanorPerCapitaPerArea2021 <- Misdemeanor_in_2021 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency21=n())

MisdemeanorPerCapitaPerArea2021 <-  mutate(MisdemeanorPerCapitaPerArea2021,MisdemeanorPer1000inhabitants21=((MisdemeanorPerCapitaPerArea2021$MisdemeanorFrequency21/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2021 <- rbind(MisdemeanorPerCapitaPerArea2021,list("Unassigned -- Jail",0,0))

MisdemeanorPerCapitaPerArea2020 <- Misdemeanor_in_2020 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency20=n())

MisdemeanorPerCapitaPerArea2020 <-  mutate(MisdemeanorPerCapitaPerArea2020,MisdemeanorPer1000inhabitants20=((MisdemeanorPerCapitaPerArea2020$MisdemeanorFrequency20/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2020 <- rbind(MisdemeanorPerCapitaPerArea2020,list("Unassigned -- Jail",0,0))

MisdemeanorPerCapitaPerArea2019 <- Misdemeanor_in_2019 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency19=n())

MisdemeanorPerCapitaPerArea2019 <-  mutate(MisdemeanorPerCapitaPerArea2019,MisdemeanorPer1000inhabitants19=((MisdemeanorPerCapitaPerArea2019$MisdemeanorFrequency19/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2019 <- rbind(MisdemeanorPerCapitaPerArea2019,list("Unassigned -- Jail",0,0))

MisdemeanorPerCapitaPerArea2018 <- Misdemeanor_in_2018 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency18=n())

MisdemeanorPerCapitaPerArea2018 <-  mutate(MisdemeanorPerCapitaPerArea2018,MisdemeanorPer1000inhabitants18=((MisdemeanorPerCapitaPerArea2018$MisdemeanorFrequency18/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2018 <- rbind(MisdemeanorPerCapitaPerArea2018,list("Unassigned -- Jail",0,0))

MisdemeanorPerCapitaPerArea2017 <- Misdemeanor_in_2017 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency17=n())

MisdemeanorPerCapitaPerArea2017 <-  mutate(MisdemeanorPerCapitaPerArea2017,MisdemeanorPer1000inhabitants17=((MisdemeanorPerCapitaPerArea2017$MisdemeanorFrequency17/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2017 <- rbind(MisdemeanorPerCapitaPerArea2017,list("Unassigned -- Jail",0,0))

MisdemeanorPerCapitaPerArea2016 <- Misdemeanor_in_2016 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency16=n())

MisdemeanorPerCapitaPerArea2016 <-  mutate(MisdemeanorPerCapitaPerArea2016,MisdemeanorPer1000inhabitants16=((MisdemeanorPerCapitaPerArea2016$MisdemeanorFrequency16/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2016 <- rbind(MisdemeanorPerCapitaPerArea2016,list("Unassigned -- Jail",0,0))

MisdemeanorPerCapitaPerArea2015 <- Misdemeanor_in_2015 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency15=n())

MisdemeanorPerCapitaPerArea2015 <-  mutate(MisdemeanorPerCapitaPerArea2015,MisdemeanorPer1000inhabitants15=((MisdemeanorPerCapitaPerArea2015$MisdemeanorFrequency15/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2015 <- rbind(MisdemeanorPerCapitaPerArea2015,list("Unassigned -- Jail",0,0))

MisdemeanorPerCapitaPerArea2014 <- Misdemeanor_in_2014 %>% 
  group_by(Community) %>%
  summarize(MisdemeanorFrequency14=n())

MisdemeanorPerCapitaPerArea2014 <-  mutate(MisdemeanorPerCapitaPerArea2014,MisdemeanorPer1000inhabitants14=((MisdemeanorPerCapitaPerArea2014$MisdemeanorFrequency14/population_data$tpop20)*1000))

MisdemeanorPerCapitaPerArea2014 <- rbind(MisdemeanorPerCapitaPerArea2014,list("Unassigned -- Jail",0,0))

Misdemeanor_evolution <- MisdemeanorPerCapitaPerArea2021 %>% 
  left_join(MisdemeanorPerCapitaPerArea2020,by="Community") %>% 
  left_join(MisdemeanorPerCapitaPerArea2019,by="Community") %>%
  left_join(MisdemeanorPerCapitaPerArea2018,by="Community") %>%
  left_join(MisdemeanorPerCapitaPerArea2017,by="Community") %>% 
  left_join(MisdemeanorPerCapitaPerArea2016,by="Community") %>% 
  left_join(MisdemeanorPerCapitaPerArea2015,by="Community") %>% 
  left_join(MisdemeanorPerCapitaPerArea2014,by="Community")

Community_data <- Community_data %>% 
  left_join(Misdemeanor_evolution,by="Community")

Misdemeanor_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(Category=="Misdemeanor") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "In Baltimore, misdemeanors started to decrease from 2017",x="Year",y="Misdemeanor occurrences")

Misdemeanor_Yearly_evolution_map
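The per-year blocks above all repeat one pattern: filter a category, count per community, divide by population. A minimal sketch of that pattern as a single function follows. It assumes toy stand-ins for crime_data_with_areas and population_data; the function name rate_per_year and the long output layout (one row per community and year) are illustrative choices, not part of the original analysis. It matches population to crime counts through the Community key rather than by row position, which stays correct even when a community has no crimes of a given type in a given year.

```r
library(dplyr)

# Hypothetical toy data standing in for crime_data_with_areas and population_data
crimes <- tibble(
  Community = c("A", "A", "B", "A", "B", "B", "B"),
  CrimeDateTime = as.Date(c("2014-03-01", "2014-07-15", "2014-09-30",
                            "2015-01-10", "2015-02-11", "2015-06-01",
                            "2015-12-24")),
  Category = c("Felony", "Felony", "Felony", "Misdemeanor",
               "Felony", "Felony", "Felony")
)
# Deliberately listed in a different order than the crime counts
population <- tibble(Community = c("B", "A"), tpop20 = c(2000, 1000))

# Count crimes of one category per community and year,
# then join population by key and express the count per 1000 inhabitants
rate_per_year <- function(crimes, population, category) {
  crimes %>%
    filter(Category == category) %>%
    mutate(Year = as.integer(format(CrimeDateTime, "%Y"))) %>%
    count(Community, Year, name = "Frequency") %>%
    left_join(population, by = "Community") %>%
    mutate(Per1000inhabitants = Frequency / tpop20 * 1000) %>%
    select(Community, Year, Frequency, Per1000inhabitants)
}

felony_rates <- rate_per_year(crimes, population, "Felony")
felony_rates
```

To recover the wide, one-column-per-year layout used for Community_data, the result could then be reshaped, for example with tidyr::pivot_wider.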

3.5 Internet access and crimes

First, we need to merge the data into one big file. Every dataset needs one column with the same name (e.g. Community) to serve as the join key. We create a table with the following columns: Community Statistical Area, Internet accessibility CSA, CCTV per area, CrimeStatsPerArea, crime_data_with_areas, FelonyStats, MisdemeanorStats. Merging these files into one gives us an overview table and enables us to run regressions and generate meaningful output from it. We simply merge the files on their common column, namely the community statistical area.

Analysis

4.1 CCTVs VS Crime - Does the presence of CCTV deter crime?

To be able to comment on the effectiveness of CCTVs in deterring crime, we first investigate the relationship between CCTV density and crime per capita. To do so, we fit a simple linear regression model on a new data frame called CCTV_VS_crimes (essentially a left join). The regression indicates a moderate positive association between CCTV density and crime per capita, with an \(R^2\) of 42.9%. Plotting the observations makes this tendency visible. The blue line represents the regression line.

Regression Crime vs CCTVs

                       Dependent variable:
                       CrimePer1000inhabitants
density_perc           91.200*** (14.300)
Constant               463.000*** (42.000)
Observations           56
R2                     0.429
Adjusted R2            0.419
Residual Std. Error    250.000 (df = 54)
F Statistic            40.600*** (df = 1; 54)
Note: *p<0.1; **p<0.05; ***p<0.01 (standard errors in parentheses)
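As a rough illustration of how such a model is fitted and read, the sketch below runs lm() on made-up numbers (the toy data frame and its values are hypothetical, not the Baltimore data). In a regression with a single predictor, \(R^2\) equals the squared Pearson correlation, so an \(R^2\) of 0.429 corresponds to a correlation of about 0.65.

```r
# Hypothetical toy data; column names mirror the report, values are invented
toy <- data.frame(
  density_perc = c(1, 2, 3, 4, 5),
  CrimePer1000inhabitants = c(2, 4, 5, 4, 5)
)

# Simple linear regression of crime per capita on CCTV density
fit <- lm(CrimePer1000inhabitants ~ density_perc, data = toy)

# R-squared of the fit and the Pearson correlation of the two variables
r_squared <- summary(fit)$r.squared
pearson_r <- cor(toy$density_perc, toy$CrimePer1000inhabitants)

coef(fit)      # intercept and slope
r_squared      # share of variance explained
pearson_r^2    # identical to r_squared in the one-predictor case
```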

4.1.1 Mapping of CCTVs and crime, felony and misdemeanor per capita

In this section we map the CCTVs and crimes. The method is the same as before, using the tmap package. However, this time we have two different shapes: tm_shape(baltimore), which constitutes the base map, and tm_shape(balt_dat), which adds a layer of points. The map gives an intuition of the phenomenon illustrated before: where crime per capita is lowest, there seem to be fewer CCTVs (for instance in the northern part of the city or even in the western areas). There seems to be a correlation between the dark red areas and the CCTV locations.

Crime_and_CCTV_map <- tm_shape(baltimore) + tm_fill(col = "CrimePer1000inhabitants", title ="Crime per 1,000 inhabitants per area",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)+ tm_shape(balt_dat) + tm_dots(col="black")

Felony_and_CCTV_map <- tm_shape(baltimore) + tm_fill(col = "FelonyPerCapitaPerArea", title ="Felony per capita per Area in %", style = "quantile") + tm_borders(col="black",alpha=0.3)+ tm_layout(inner.margins = 0.05) + tm_shape(balt_dat) + tm_dots(col="black")

Misdemeanor_and_CCTV_map <- tm_shape(baltimore) + tm_fill(col = "MisdemeanorPerCapitaPerArea", title ="Misdemeanor per capita per Area in %",style = "quantile") + tm_borders(col="black",alpha=0.3)+ tm_layout(inner.margins = 0.05) + tm_shape(balt_dat) + tm_dots(col="black")

tmap_mode("view") #Use this command to have interactive maps

baltimore@data[["fid"]]<-baltimore@data[["community"]] #We do that so that we see the name of the Community when using an interactive map

tmap_arrange(Crime_and_CCTV_map,Felony_and_CCTV_map,Misdemeanor_and_CCTV_map)

This map shows quite well that CCTV placement seems to follow the areas where crime per capita is the highest. Looking at the north-western and south-western areas of the map, it can be seen that the placement of CCTVs aligns rather well with the areas considered dangerous.

Crime_per_capita_VS_CCTV_map <- tm_shape(baltimore) + tm_fill(col = "CrimePer1000inhabitants", title ="Crime per capita",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05) + tm_shape(balt_dat) + tm_dots(col="black") 

tmap_mode("plot")
Crime_per_capita_VS_CCTV_map

However, interesting as this correlation is, it does not allow one to say much about the effectiveness of CCTVs. We are stuck in a chicken-and-egg problem: cameras may simply have been placed where crime was already high. To be able to comment on the crime-deterring potential of CCTV, we decided to analyse how crime per capita evolved in certain areas in relation to their respective CCTV density. To select our areas of interest, we performed a k-means clustering on two dimensions: CCTV density and crime per capita. We first scale the data, then add the name of each area as row name. We then use the elbow method to determine the optimal number of clusters using fviz_nbclust. When performing the k-means clustering, we specify the nstart parameter. With the default of a single start, only one random set of rows is chosen as initial centers, whereas a larger nstart tries several random starts and keeps the best result. We end up with 5 clusters, which we represent using fviz_cluster. The ratio of the between-cluster sum of squares to the total sum of squares is good (about 85%); the higher this value the better, since we want the variation to come from between groups and not from within groups.

library(FactoMineR)
library(factoextra)

sc.Community_data_Clustering <- scale(Community_data[,c(2:89)]) #We scale the data

row.names(sc.Community_data_Clustering) <- as.vector(t(Community_data[,1])) #We add names of area for each row

fviz_nbclust(sc.Community_data_Clustering[,c(1,3)], kmeans, method="wss")+
  geom_vline(xintercept = 5, linetype = 2) + # add line for better visualisation
  labs(subtitle = "Elbow method")  #We can determine the optimal number of cluster, 5 clusters seems to be reasonable

set.seed(10)
km.clust2 <- kmeans(sc.Community_data_Clustering[,c(1,3)], 5, nstart = 25)
print(km.clust2)
fviz_cluster(km.clust2, data=sc.Community_data_Clustering[,c(1,3)], repel=TRUE)+labs(x="Crime per capita",y="CCTV density",title = str_wrap(("With 5 clusters, between-cluster SS accounts for a rather good 85% of total SS"),width=80),subtitle=str_wrap(("We will focus on areas in the high crime, high CCTV density cluster as well as those in the high crime, low CCTV density cluster"),width=90))
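The quality measure mentioned above can be read directly off the object returned by kmeans(). The following is a minimal sketch on simulated data (the two toy groups and their values are invented for illustration and use 2 clusters rather than the 5 used in the report). It shows that the total sum of squares splits into a between-cluster and a within-cluster part, and that the reported percentage is the between-cluster share of the total.

```r
set.seed(10)

# Simulated toy data: two clearly separated groups of areas (hypothetical values)
toy <- scale(cbind(
  crime = c(rnorm(10, mean = 50, sd = 5), rnorm(10, mean = 150, sd = 5)),
  cctv  = c(rnorm(10, mean = 2, sd = 0.5), rnorm(10, mean = 8, sd = 0.5))
))

# nstart = 25 runs k-means from 25 random starts and keeps the best solution
km <- kmeans(toy, centers = 2, nstart = 25)

# Total SS = between-cluster SS + total within-cluster SS
km$totss
km$betweenss + km$tot.withinss

# Share of variation explained by the clustering (the value shown by print(km))
ratio <- km$betweenss / km$totss
ratio
```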

We start by analysing how crime per capita evolved in areas belonging to the “high crime, high CCTV density” cluster. Besides representing this evolution graphically, we also compute the percentage change in crime per capita between 2014 and 2019.

Downtown_Seton_Hill_evolution <- as.vector(t(Community_data[14,c(25,23,21,19,17,15,13)]))

Year <- c(2014:2020)

Downtown_Seton_Hill <- data.frame(Downtown_Seton_Hill_evolution,Year)

Downtown_Seton_Hill_map <- ggplot(Downtown_Seton_Hill,aes(x=Year,y=Downtown_Seton_Hill_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) + 
  labs(title = "Downtown/Seton Hill",x="Year",y="Crime per capita")

Oldtown_Middle_East_evolution <- as.vector(t(Community_data[41,c(25,23,21,19,17,15,13)]))

Oldtown_Middle_East <- data.frame(Oldtown_Middle_East_evolution,Year)

Oldtown_Middle_East_map <- ggplot(Oldtown_Middle_East,aes(x=Year,y=Oldtown_Middle_East_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) + 
  labs(title = "Oldtown/Middle East",x="Year",y="Crime per capita")

Sandtown_Winchester_Harlem_Park_evolution <- as.vector(t(Community_data[47,c(25,23,21,19,17,15,13)]))

Sandtown_Winchester_Harlem_Park <- data.frame(Sandtown_Winchester_Harlem_Park_evolution,Year)

Sandtown_Winchester_Harlem_Park_map <- ggplot(Sandtown_Winchester_Harlem_Park,aes(x=Year,y=Sandtown_Winchester_Harlem_Park_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) +
  labs(title = "Sandtown-Winchester/Harlem Park",x="Year",y="Crime per capita")

Cherry_Hill_evolution <- as.vector(t(Community_data[7,c(25,23,21,19,17,15,13)]))

Cherry_Hill <- data.frame(Cherry_Hill_evolution,Year)

Cherry_Hill_map <- ggplot(Cherry_Hill,aes(x=Year,y=Cherry_Hill_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) +
  labs(title = "Cherry Hill",x="Year",y="Crime per capita")

library(pdp)

grid.arrange(Downtown_Seton_Hill_map,Oldtown_Middle_East_map,Sandtown_Winchester_Harlem_Park_map,Cherry_Hill_map,nrow=2,ncol=2)

Crime_Evolution_VS_CCTV <- Community_data[,c(1,25,15,4)] %>% 
  mutate(change_perc=((Community_data$CrimePer1000inhabitants19/Community_data$CrimePer1000inhabitants14)-1)*100)

Crime_Evolution_VS_CCTV[c(7,14,41,47),c(-3)]
#> # A tibble: 4 x 4
#>   Community             CrimePer1000inhabit~ density_perc change_perc
#>   <chr>                                <dbl>        <dbl>       <dbl>
#> 1 Cherry Hill                           84.8         7.06      -0.157
#> 2 Downtown/Seton Hill                  169.         10.0       39.0  
#> 3 Oldtown/Middle East                  119.          7.66      25.2  
#> 4 Sandtown-Winchester/~                117.          7.42      -8.10

For our analysis we only consider the 2014-2019 period, since it seems reasonable to assume that the decrease observed in most areas in 2020 can be attributed to the Covid-19 pandemic. In two of these four areas with high crime and high CCTV density, crime per capita increased. It decreased marginally in Cherry Hill (-0.16%) and more markedly in Sandtown-Winchester/Harlem Park (-8.1%). To be able to comment on CCTV effectiveness, we decided to also analyse how crime per capita evolved in areas with high crime and low CCTV density.
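The change_perc column is a plain relative change between the two end years. A small sketch with invented values (the area names A and B and their rates are hypothetical) shows the arithmetic:

```r
# Hypothetical crime rates per 1000 inhabitants for two made-up areas
crime_2014 <- c(A = 100, B = 80)
crime_2019 <- c(A = 125, B = 60)

# Percentage change from 2014 to 2019: positive means crime per capita rose
change_perc <- (crime_2019 / crime_2014 - 1) * 100
change_perc
```

With these numbers, area A shows an increase of 25% and area B a decrease of 25%.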

Washington_Village_Pigtown_evolution <- as.vector(t(Community_data[54,c(25,23,21,19,17,15,13)]))

Washington_Village_Pigtown_ <- data.frame(Washington_Village_Pigtown_evolution,Year)

Washington_Village_Pigtown_map <- ggplot(Washington_Village_Pigtown_,aes(x=Year,y=Washington_Village_Pigtown_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) +
  labs(title = "Washington Village/Pigtown",x="Year",y="Crime per capita")

Harbor_East_Little_Italy_evolution <- as.vector(t(Community_data[26,c(25,23,21,19,17,15,13)]))

Harbor_East_Little_Italy <- data.frame(Harbor_East_Little_Italy_evolution,Year)

Harbor_East_Little_Italy_map <- ggplot(Harbor_East_Little_Italy,aes(x=Year,y=Harbor_East_Little_Italy_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) +
  labs(title = "Harbor East/Little Italy",x="Year",y="Crime per capita")

Madison_East_End_evolution <- as.vector(t(Community_data[33,c(25,23,21,19,17,15,13)]))

Madison_East_End <- data.frame(Madison_East_End_evolution,Year)

Madison_East_End_map <- ggplot(Madison_East_End,aes(x=Year,y=Madison_East_End_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) +
  labs(title ="Madison/East End",x="Year",y="Crime per capita")

Southwest_Baltimore_evolution <- as.vector(t(Community_data[51,c(25,23,21,19,17,15,13)]))

Southwest_Baltimore <- data.frame(Southwest_Baltimore_evolution,Year)

Southwest_Baltimore_map <- ggplot(Southwest_Baltimore,aes(x=Year,y=Southwest_Baltimore_evolution)) + 
  geom_line() +
  scale_y_continuous(limits = c(0, 250), breaks =seq(0,250,50)) +
  labs(title ="Southwest Baltimore",x="Year",y="Crime per capita")


grid.arrange(Washington_Village_Pigtown_map,Harbor_East_Little_Italy_map,Madison_East_End_map,Southwest_Baltimore_map,nrow=2,ncol=2)

Crime_Evolution_VS_CCTV[c(26,33,51,54),c(-3)]
#> # A tibble: 4 x 4
#>   Community                  CrimePer1000inh~ density_perc change_perc
#>   <chr>                                 <dbl>        <dbl>       <dbl>
#> 1 Harbor East/Little Italy               173.         3.47       -27.3
#> 2 Madison/East End                       131.         3.59        12.9
#> 3 Southwest Baltimore                    115.         5.26        20.8
#> 4 Washington Village/Pigtown             167.         3.71        11.9

When comparing the four members of the high crime, high CCTV density cluster with the four members of the high crime, low CCTV density cluster, we note that crime per capita increased in three of the four low CCTV density areas (it fell by 27.3% in Harbor East/Little Italy), while it decreased in two of the four high CCTV density areas. However, it is crucial to note that these observations have no statistical significance. The decrease we observe in Cherry Hill and Sandtown-Winchester/Harlem Park could be the result of pure coincidence or could be attributed to many other factors. It is, for example, important to note that we observe a decreasing tendency when looking at the overall yearly crime evolution. Another important point is that we find no statistically significant relationship between CCTV density and the change in crime per capita: the regression below has an \(R^2\) of 0.033 and a p-value of 0.185.

Crime_Yearly_evolution_map <- crime_data_with_areas %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "Overall, crime seems to have decreased over the 2017 to 2020 period",subtitle=str_wrap(("This explains why we should be careful when considering CCTV effectiveness"),width=80),x="Year",y="Crime occurrences")

Crime_Yearly_evolution_map

regression7 <- lm(Crime_Evolution_VS_CCTV$change_perc~Crime_Evolution_VS_CCTV$density_perc)
summary(regression7)
#> 
#> Call:
#> lm(formula = Crime_Evolution_VS_CCTV$change_perc ~ Crime_Evolution_VS_CCTV$density_perc)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -41.86 -12.76  -0.48  12.05  66.52 
#> 
#> Coefficients:
#>                                      Estimate Std. Error t value
#> (Intercept)                             0.528      3.334    0.16
#> Crime_Evolution_VS_CCTV$density_perc    1.513      1.126    1.34
#>                                      Pr(>|t|)
#> (Intercept)                              0.87
#> Crime_Evolution_VS_CCTV$density_perc     0.18
#> 
#> Residual standard error: 19.6 on 53 degrees of freedom
#>   (1 observation deleted due to missingness)
#> Multiple R-squared:  0.033,  Adjusted R-squared:  0.0147 
#> F-statistic: 1.81 on 1 and 53 DF,  p-value: 0.185
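
As a minimal sketch of how a regression like the one above can be set up and read, the following uses simulated data. The data frame `toy`, its columns `crime_start` and `crime_end`, and all values are assumptions made for illustration; they are not the project's data, and the way `change_perc` was actually computed in the project is not shown in this section.

```r
# Simulated stand-in for the community-level table (hypothetical values).
set.seed(1)
toy <- data.frame(
  density_perc = runif(55, min = 0, max = 10),
  crime_start  = runif(55, min = 50, max = 200)
)
# End-of-period crime is drawn independently of CCTV density on purpose.
toy$crime_end <- toy$crime_start * runif(55, min = 0.7, max = 1.3)

# Percentage change in crime per capita over the period.
toy$change_perc <- (toy$crime_end - toy$crime_start) / toy$crime_start * 100

# Passing the data frame via `data =` keeps the coefficient names short.
fit <- lm(change_perc ~ density_perc, data = toy)
fit_summary <- summary(fit)

# Extract the two quantities discussed in the text.
slope_p_value <- fit_summary$coefficients["density_perc", "Pr(>|t|)"]
r_squared     <- fit_summary$r.squared

# The slope is treated as uninformative when its p-value exceeds 0.05.
slope_is_significant <- slope_p_value < 0.05
```

In the report's own output the slope has a p-value of 0.18, so by this criterion CCTV density is not a significant predictor of the change in crime per capita.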

4.1.2 Analysis of where crime took place: August 2021

Another way to assess CCTVs' crime-deterring potential is to locate crimes spatially and compare them with CCTV locations. We know that CCTVs capture activities within 256ft (~2 blocks). We select only crimes committed in August 2021 to keep the data interpretable (choosing a larger time frame would make the map unreadable). We choose August 2021 because it is the latest full month in our data set. Taking the latest time points makes it likely that most of the CCTVs in the data set were already installed (we have no information on when exactly each CCTV was added). As before, we create a data table, assign coordinates and define the CRS (here EPSG:4326, which we then transform with spTransform). We again create a map with tm_shape to visualise the results. The output shows where crime takes place relative to the CCTV locations. By zooming in on the map, we see that some crimes are committed directly in front of CCTVs. Although this is not conclusive evidence, this observation goes against the idea that CCTVs are effective crime deterrents.

# Keep only the crimes recorded in August 2021
crime_spatial <- as.data.table(crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2021-08-01") & CrimeDateTime <= as.Date("2021-08-31")))
# Turn the table into a spatial object, declare its CRS (EPSG:4326) and reproject it to crs.geo1
coordinates(crime_spatial) <-  c("Longitude","Latitude")
proj4string(crime_spatial) <-  CRS("+init=epsg:4326")
crime_spatial <- spTransform(crime_spatial,crs.geo1)

# CCTV locations in black, August 2021 crimes in red
August21Crimes_VS_CCTV <- tm_shape(baltimore) + tm_borders(col="black",alpha=0.3) +
  tm_layout(inner.margins = 0.1, title="Crimes committed in August 2021 VS CCTV location",frame.lwd = 5) +
  tm_shape(balt_dat) + tm_dots(col="black") +
  tm_shape(crime_spatial) + tm_dots(col="red",alpha=0.5)

#It could be interesting to see where crime took place relative to CCTV locations in the area with the highest crime rate in August 2021

tmap_mode("view") #Use this command to have interactive maps
August21Crimes_VS_CCTV
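
The visual impression from the map could be complemented by a count of crimes that fall within camera range. The sketch below is one possible way to do this in base R, using the haversine formula on longitude/latitude pairs and the 256ft range mentioned above. The function `haversine_m`, the two small data frames and all coordinates are hypothetical and only illustrate the idea; this calculation is not part of the original analysis.

```r
# Great-circle distance in metres between points given in degrees (haversine formula).
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical coordinates, not taken from the project's data.
cctv <- data.frame(lon = c(-76.6150, -76.6300), lat = c(39.2900, 39.3000))
crimes <- data.frame(
  lon = c(-76.6150, -76.6155, -76.6500),
  lat = c(39.2900, 39.2903, 39.3200)
)

# Distance from each crime to its nearest camera.
crimes$nearest_cctv_m <- vapply(
  seq_len(nrow(crimes)),
  function(i) min(haversine_m(crimes$lon[i], crimes$lat[i], cctv$lon, cctv$lat)),
  numeric(1)
)

# 256 ft expressed in metres (1 ft = 0.3048 m).
camera_range_m <- 256 * 0.3048
crimes$within_range <- crimes$nearest_cctv_m <= camera_range_m

# Share of crimes committed within range of at least one camera.
share_within_range <- mean(crimes$within_range)
```

Applied to the real crime and CCTV tables, such a share would give a number to compare across months or areas, although on its own it would still not establish whether cameras deter crime.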

To examine this point further, we decided to focus on one specific area. We arbitrarily chose the area with the highest crime incidence in August 2021, which required calculating the crime rate per area for that month. The results show that the crime rate was highest in Downtown, so we take a closer look at the Downtown area.

CrimePerCapitaPerAreaAugust2021 <- crime_data_with_areas %>%  filter(CrimeDateTime >= as.Date("2021-08-01") & CrimeDateTime <= as.Date("2021-08-31")) %>% 
  group_by(Community) %>%
  summarize(CrimeFrequency=n())

CrimePerCapitaPerAreaAugust2021 <-mutate(CrimePerCapitaPerAreaAugust2021,CrimePer1000inhabitants=((CrimePerCapitaPerAreaAugust2021$CrimeFrequency/population_data$tpop20)*1000))
#We see that Downtown is the area with the highest crime rate in August 2021, so we focus on that area to see whether crimes take place directly next to CCTVs
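
The `mutate()` call above divides two columns by row position, which gives the intended result only if both tables list the same communities in the same order. As a hedged alternative, the sketch below joins the two tables on the community name before dividing. It uses base R's `merge()` to stay dependency-free; the presence of a `Community` column in the population table and all values shown are assumptions for illustration.

```r
# Hypothetical inputs: August crime counts and population per community.
crime_counts <- data.frame(
  Community      = c("Downtown/Seton Hill", "Canton"),
  CrimeFrequency = c(120, 30),
  stringsAsFactors = FALSE
)
population <- data.frame(
  Community = c("Canton", "Downtown/Seton Hill", "Hamilton"),
  tpop20    = c(8000, 10000, 12000),
  stringsAsFactors = FALSE
)

# Join on the community name; all.y = TRUE keeps communities with no recorded crime.
crime_rate <- merge(crime_counts, population, by = "Community", all.y = TRUE)
crime_rate$CrimeFrequency[is.na(crime_rate$CrimeFrequency)] <- 0

# Crimes per 1000 inhabitants, computed row by row on matched communities.
crime_rate$CrimePer1000inhabitants <- crime_rate$CrimeFrequency / crime_rate$tpop20 * 1000

# Community with the highest rate.
highest <- crime_rate$Community[which.max(crime_rate$CrimePer1000inhabitants)]
```

With a key-based join, a community that has no crime in the selected month shows up with a rate of zero instead of shifting every row below it.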

In order to create a “sub-map”, we create a smaller area using the st_bbox function. The values passed to the function are the extreme values on the x-axis and y-axis of the map in the EPSG:3857 (WGS 84 / Pseudo-Mercator) coordinate system. Then, using tm_shape with the new spatial object Downtown_area as argument, we create a map in the same way as before. As we want to be able to locate this smaller area on the full map of Baltimore, we must also create a Baltimore map with a rectangle representing the newly created “sub-area”. To combine the two maps, we run the last two lines together and use the viewport function. The output is a zoom on the desired area combined with the larger map, in which a rectangle marks the area we are analysing. In Downtown, we quite clearly see that some crimes are committed right next to some CCTVs.

# Bounding box of the Downtown area in EPSG:3857 coordinates (ymin is the smaller y value)
Downtown_area <-  st_bbox(c(xmin = -8531335.08, xmax = -8526873.06,
                      ymin = 4762527.65, ymax = 4765236.47),
                    crs = st_crs(baltimore)) %>% st_as_sfc()
 
Downtown_map <- tm_shape(Downtown_area) + tm_borders(col="white")+ tm_shape(baltimore) + tm_borders(col="black") + tm_layout(inner.margins = 0.05,frame.lwd = 5,title = "Zoom on Downtown Area",title.position = c('left', 'top'))+tm_scale_bar(position = c("left", "top"))+ tm_shape(balt_dat) + tm_symbols(shape = 2, col = "black", size = 0.07)+tm_shape(crime_spatial)+tm_dots(col="red")

Baltimore_map_2 <- tm_shape(baltimore) + tm_borders()+ tm_shape(Downtown_area) + tm_borders(lwd = 1.5,col = "red") + tm_layout(frame.lwd = 6,inner.margins = 0.05)

tmap_mode("plot")
Downtown_map
print(Baltimore_map_2, vp = viewport(0.8, 0.27, width = 0.5, height = 0.5)) #By running these two lines together, we obtain the map with an additional overview
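
Bounding-box values in EPSG:3857 are hard to read by eye. As a sketch, the standard spherical Web Mercator formulas can be used to derive them from ordinary longitude/latitude corners. The function name `lonlat_to_webmercator` and the corner coordinates below are hypothetical; they are chosen to lie roughly around central Baltimore and are not the values used in the project.

```r
# Forward spherical Web Mercator projection (EPSG:3857) from degrees to metres.
lonlat_to_webmercator <- function(lon, lat, radius = 6378137) {
  x <- radius * lon * pi / 180
  y <- radius * log(tan(pi / 4 + lat * pi / 360))
  data.frame(x = x, y = y)
}

# Hypothetical corner points roughly around central Baltimore.
corners <- lonlat_to_webmercator(lon = c(-76.64, -76.60), lat = c(39.28, 39.30))

# Taking min/max explicitly guarantees xmin < xmax and ymin < ymax.
bbox_values <- c(
  xmin = min(corners$x), xmax = max(corners$x),
  ymin = min(corners$y), ymax = max(corners$y)
)
```

The named vector `bbox_values` has the same four names that `st_bbox()` expects, so it could be passed to that function together with the appropriate CRS.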

4.2 What types of crimes may be deterred by surveillance cameras?

The logic here is the same as in section 4.1, except that here we want to see whether the presence of CCTV can deter one type of crime in particular. We start with felonies and misdemeanors, then we analyse violent and property crimes.

4.2.1 CCTVs VS Felonies and Misdemeanors

The results of the simple linear regressions show a fairly weak \(R^2\) for both felonies (0.380) and misdemeanors (0.390). It therefore does not seem that the presence of CCTV has a particularly strong impact on a certain type of crime.

#> 
#> CCTV vs FelonyPerCapitaPerArea
#> ===============================================
#>                         Dependent variable:    
#>                     ---------------------------
#>                       FelonyPerCapitaPerArea   
#> -----------------------------------------------
#> density_perc                 39.600***         
#>                               (6.880)          
#>                                                
#> Constant                    218.000***         
#>                              (20.200)          
#>                                                
#> -----------------------------------------------
#> Observations                    56             
#> R2                             0.380           
#> Adjusted R2                    0.369           
#> Residual Std. Error      120.000 (df = 54)     
#> F Statistic           33.100*** (df = 1; 54)   
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01
#> 
#> CCTV vs MisdemeanorperCapitaPerArea
#> ===============================================
#>                         Dependent variable:    
#>                     ---------------------------
#>                     MisdemeanorPerCapitaPerArea
#> -----------------------------------------------
#> density_perc                 51.600***         
#>                               (8.780)          
#>                                                
#> Constant                    245.000***         
#>                              (25.800)          
#>                                                
#> -----------------------------------------------
#> Observations                    56             
#> R2                             0.390           
#> Adjusted R2                    0.379           
#> Residual Std. Error      153.000 (df = 54)     
#> F Statistic           34.500*** (df = 1; 54)   
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01

We are in exactly the same situation as with overall crime: in order to comment on CCTV effectiveness against felonies and/or misdemeanors, we must analyse how felonies and misdemeanors evolved over time in areas with and without CCTVs. We start with felonies.

#> K-means clustering with 5 clusters of sizes 16, 8, 6, 7, 19
#> 
#> Cluster means:
#>   FelonyPerCapitaPerArea density_perc
#> 1                 0.1488       -0.533
#> 2                -0.0242        0.626
#> 3                 1.2195        2.299
#> 4                 1.5187        0.367
#> 5                -1.0597       -0.676
#> 
#> Clustering vector:
#>         Allendale/Irvington/S. Hilton 
#>                                     1 
#>       Beechfield/Ten Hills/West Hills 
#>                                     5 
#>                         Belair-Edison 
#>                                     1 
#>     Brooklyn/Curtis Bay/Hawkins Point 
#>                                     1 
#>                                Canton 
#>                                     1 
#>                     Cedonia/Frankford 
#>                                     1 
#>                           Cherry Hill 
#>                                     3 
#>             Chinquapin Park/Belvedere 
#>                                     5 
#>                   Claremont/Armistead 
#>                                     5 
#>                         Clifton-Berea 
#>                                     4 
#>               Cross-Country/Cheswolde 
#>                                     5 
#>              Dickeyville/Franklintown 
#>                                     5 
#>                  Dorchester/Ashburton 
#>                                     1 
#>                   Downtown/Seton Hill 
#>                                     3 
#>                     Edmondson Village 
#>                                     1 
#>                           Fells Point 
#>                                     2 
#>                  Forest Park/Walbrook 
#>                                     1 
#>                        Glen-Fallstaff 
#>                                     1 
#>       Greater Charles Village/Barclay 
#>                                     2 
#>                        Greater Govans 
#>                                     5 
#>                     Greater Mondawmin 
#>                                     4 
#>       Greater Roland Park/Poplar Hill 
#>                                     5 
#>                      Greater Rosemont 
#>                                     2 
#>                       Greenmount East 
#>                                     2 
#>                              Hamilton 
#>                                     5 
#>              Harbor East/Little Italy 
#>                                     2 
#>                      Harford/Echodale 
#>                                     5 
#>                          Highlandtown 
#>                                     5 
#>            Howard Park/West Arlington 
#>                                     1 
#>             Inner Harbor/Federal Hill 
#>                                     2 
#>                            Lauraville 
#>                                     5 
#>                            Loch Raven 
#>                                     5 
#>                      Madison/East End 
#>                                     4 
#>  Medfield/Hampden/Woodberry/Remington 
#>                                     5 
#>                               Midtown 
#>                                     2 
#>                     Midway/Coldstream 
#>                                     4 
#>              Morrell Park/Violetville 
#>                                     1 
#>           Mount Washington/Coldspring 
#>                                     5 
#>     North Baltimore/Guilford/Homeland 
#>                                     5 
#>                             Northwood 
#>                                     5 
#>                   Oldtown/Middle East 
#>                                     3 
#>         Orangeville/East Highlandtown 
#>                                     1 
#>           Patterson Park North & East 
#>                                     5 
#>             Penn North/Reservoir Hill 
#>                                     1 
#>             Pimlico/Arlington/Hilltop 
#>                                     4 
#> Poppleton/The Terraces/Hollins Market 
#>                                     4 
#>       Sandtown-Winchester/Harlem Park 
#>                                     3 
#>                       South Baltimore 
#>                                     5 
#>                          Southeastern 
#>                                     1 
#>                 Southern Park Heights 
#>                                     2 
#>                   Southwest Baltimore 
#>                                     3 
#>                         The Waverlies 
#>                                     1 
#>                   Upton/Druid Heights 
#>                                     3 
#>            Washington Village/Pigtown 
#>                                     4 
#>        Westport/Mount Winans/Lakeland 
#>                                     1 
#>                    Unassigned -- Jail 
#>                                     5 
#> 
#> Within cluster sum of squares by cluster:
#> [1] 3.52 2.46 3.32 1.96 2.96
#>  (between_SS / total_SS =  87.1 %)
#> 
#> Available components:
#> 
#> [1] "cluster"      "centers"      "totss"        "withinss"    
#> [5] "tot.withinss" "betweenss"    "size"         "iter"        
#> [9] "ifault"
#> # A tibble: 4 x 4
#>   Community            FelonyPer1000inhabit~ density_perc change_perc
#>   <chr>                                <dbl>        <dbl>       <dbl>
#> 1 Cherry Hill                           43.9         7.06       0.305
#> 2 Downtown/Seton Hill                   84.8        10.0       85.0  
#> 3 Oldtown/Middle East                   64.1         7.66      42.0  
#> 4 Sandtown-Winchester~                  58.6         7.42      -5.80
#> # A tibble: 4 x 4
#>   Community             FelonyPer1000inhabi~ density_perc change_perc
#>   <chr>                                <dbl>        <dbl>       <dbl>
#> 1 Clifton-Berea                         72.6         2.15        38.5
#> 2 Madison/East End                      81.7         3.59        25.1
#> 3 Midway/Coldstream                     63.6         2.03        20.8
#> 4 Poppleton/The Terrac~                 62.0         3.59       -14.6
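
The cluster means above include negative values for both variables, which suggests that the data were standardised before clustering; the code that produced this output is not shown in the report. The sketch below illustrates one way such a result could be obtained with `scale()` and `kmeans()`. The simulated data frame `toy`, the seed and the choice of `nstart = 25` are assumptions and do not reproduce the clusters above.

```r
# Simulated stand-in for the 56 communities (hypothetical values).
set.seed(42)
toy <- data.frame(
  FelonyPerCapitaPerArea = runif(56, min = 100, max = 700),
  density_perc           = runif(56, min = 0, max = 10)
)

# Standardise both variables so that neither dominates the Euclidean distance.
toy_scaled <- scale(toy)

# Five clusters, as in the output above; nstart = 25 tries several random starts.
clusters <- kmeans(toy_scaled, centers = 5, nstart = 25)

# Attach the cluster label to each community and inspect the centres.
toy$cluster <- clusters$cluster
clusters$centers
```

Because the centres are expressed in standard deviations from the mean, a cluster with two clearly positive centres corresponds to the "high crime, high CCTV density" group discussed in the text.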

We can check how felonies evolved over time.

Felony_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(Category=="Felony") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "In Baltimore, felonies started to decrease from 2017",subtitle=str_wrap(("This explains why we should be careful when considering CCTV effectiveness in deterring felonies"),width=80),x="Year",y="Felony occurrences")

Felony_Yearly_evolution_map

regression10 <- lm(Felony_Evolution_VS_CCTV$change_perc~Felony_Evolution_VS_CCTV$density_perc)
summary(regression10)
#> 
#> Call:
#> lm(formula = Felony_Evolution_VS_CCTV$change_perc ~ Felony_Evolution_VS_CCTV$density_perc)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -41.06 -18.59   1.22  10.35  77.43 
#> 
#> Coefficients:
#>                                       Estimate Std. Error t value
#> (Intercept)                               6.78       4.41    1.54
#> Felony_Evolution_VS_CCTV$density_perc     3.32       1.49    2.23
#>                                       Pr(>|t|)  
#> (Intercept)                               0.13  
#> Felony_Evolution_VS_CCTV$density_perc     0.03 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 25.9 on 53 degrees of freedom
#>   (1 observation deleted due to missingness)
#> Multiple R-squared:  0.086,  Adjusted R-squared:  0.0688 
#> F-statistic: 4.99 on 1 and 53 DF,  p-value: 0.0298

We continue with misdemeanors.

#> K-means clustering with 5 clusters of sizes 1, 11, 13, 3, 28
#> 
#> Cluster means:
#>   MisdemeanorPerCapitaPerArea density_perc
#> 1                       3.335        3.510
#> 2                       0.544        1.417
#> 3                       0.481       -0.172
#> 4                       2.035        0.292
#> 5                      -0.774       -0.633
#> 
#> Clustering vector:
#>         Allendale/Irvington/S. Hilton 
#>                                     5 
#>       Beechfield/Ten Hills/West Hills 
#>                                     5 
#>                         Belair-Edison 
#>                                     3 
#>     Brooklyn/Curtis Bay/Hawkins Point 
#>                                     3 
#>                                Canton 
#>                                     4 
#>                     Cedonia/Frankford 
#>                                     5 
#>                           Cherry Hill 
#>                                     2 
#>             Chinquapin Park/Belvedere 
#>                                     5 
#>                   Claremont/Armistead 
#>                                     5 
#>                         Clifton-Berea 
#>                                     3 
#>               Cross-Country/Cheswolde 
#>                                     5 
#>              Dickeyville/Franklintown 
#>                                     5 
#>                  Dorchester/Ashburton 
#>                                     5 
#>                   Downtown/Seton Hill 
#>                                     1 
#>                     Edmondson Village 
#>                                     5 
#>                           Fells Point 
#>                                     3 
#>                  Forest Park/Walbrook 
#>                                     5 
#>                        Glen-Fallstaff 
#>                                     5 
#>       Greater Charles Village/Barclay 
#>                                     2 
#>                        Greater Govans 
#>                                     5 
#>                     Greater Mondawmin 
#>                                     3 
#>       Greater Roland Park/Poplar Hill 
#>                                     5 
#>                      Greater Rosemont 
#>                                     2 
#>                       Greenmount East 
#>                                     3 
#>                              Hamilton 
#>                                     5 
#>              Harbor East/Little Italy 
#>                                     4 
#>                      Harford/Echodale 
#>                                     5 
#>                          Highlandtown 
#>                                     5 
#>            Howard Park/West Arlington 
#>                                     5 
#>             Inner Harbor/Federal Hill 
#>                                     2 
#>                            Lauraville 
#>                                     5 
#>                            Loch Raven 
#>                                     5 
#>                      Madison/East End 
#>                                     2 
#>  Medfield/Hampden/Woodberry/Remington 
#>                                     5 
#>                               Midtown 
#>                                     2 
#>                     Midway/Coldstream 
#>                                     3 
#>              Morrell Park/Violetville 
#>                                     3 
#>           Mount Washington/Coldspring 
#>                                     5 
#>     North Baltimore/Guilford/Homeland 
#>                                     5 
#>                             Northwood 
#>                                     5 
#>                   Oldtown/Middle East 
#>                                     2 
#>         Orangeville/East Highlandtown 
#>                                     3 
#>           Patterson Park North & East 
#>                                     5 
#>             Penn North/Reservoir Hill 
#>                                     3 
#>             Pimlico/Arlington/Hilltop 
#>                                     3 
#> Poppleton/The Terraces/Hollins Market 
#>                                     2 
#>       Sandtown-Winchester/Harlem Park 
#>                                     2 
#>                       South Baltimore 
#>                                     5 
#>                          Southeastern 
#>                                     3 
#>                 Southern Park Heights 
#>                                     5 
#>                   Southwest Baltimore 
#>                                     2 
#>                         The Waverlies 
#>                                     3 
#>                   Upton/Druid Heights 
#>                                     2 
#>            Washington Village/Pigtown 
#>                                     4 
#>        Westport/Mount Winans/Lakeland 
#>                                     5 
#>                    Unassigned -- Jail 
#>                                     5 
#> 
#> Within cluster sum of squares by cluster:
#> [1] 0.00 7.29 3.78 1.47 4.58
#>  (between_SS / total_SS =  84.4 %)
#> 
#> Available components:
#> 
#> [1] "cluster"      "centers"      "totss"        "withinss"    
#> [5] "tot.withinss" "betweenss"    "size"         "iter"        
#> [9] "ifault"
#> # A tibble: 4 x 4
#>   Community           MisdemeanorPer1000inh~ density_perc change_perc
#>   <chr>                                <dbl>        <dbl>       <dbl>
#> 1 Downtown/Seton Hill                  150.         10.0        21.9 
#> 2 Oldtown/Middle East                   85.2         7.66       15.0 
#> 3 Sandtown-Wincheste~                   49.2         7.42      -10.7 
#> 4 Upton/Druid Heights                   73.4         5.74        7.74
#> # A tibble: 4 x 4
#>   Community                  MisdemeanorPer1~ density_perc change_perc
#>   <chr>                                 <dbl>        <dbl>       <dbl>
#> 1 Canton                                 93.1        0.239      -0.904
#> 2 Harbor East/Little Italy               70.2        3.47      -45.6  
#> 3 Penn North/Reservoir Hill              58.0        1.08        5.41 
#> 4 Washington Village/Pigtown            115.         3.71        7.16

We can check how misdemeanors evolved over time.

Misdemeanor_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(Category=="Misdemeanor") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "In Baltimore, misdemeanors started to decrease from 2017",subtitle=str_wrap(("This explains why we should be careful when considering CCTV effectiveness in deterring misdemeanors"),width=80),x="Year",y="Misdemeanor occurrences")

Misdemeanor_Yearly_evolution_map

regression11 <- lm(Misdemeanor_Evolution_VS_CCTV$change_perc~Misdemeanor_Evolution_VS_CCTV$density_perc)
summary(regression11)
#> 
#> Call:
#> lm(formula = Misdemeanor_Evolution_VS_CCTV$change_perc ~ Misdemeanor_Evolution_VS_CCTV$density_perc)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -49.48 -10.48  -0.37   7.52  55.28 
#> 
#> Coefficients:
#>                                            Estimate Std. Error
#> (Intercept)                                  -1.279      3.590
#> Misdemeanor_Evolution_VS_CCTV$density_perc    0.293      1.212
#>                                            t value Pr(>|t|)
#> (Intercept)                                  -0.36     0.72
#> Misdemeanor_Evolution_VS_CCTV$density_perc    0.24     0.81
#> 
#> Residual standard error: 21.1 on 53 degrees of freedom
#>   (1 observation deleted due to missingness)
#> Multiple R-squared:  0.0011, Adjusted R-squared:  -0.0177 
#> F-statistic: 0.0582 on 1 and 53 DF,  p-value: 0.81
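
The negative adjusted \(R^2\) in this output is not an error. As a worked check, the usual adjustment formula can be applied to the values printed above. The function name `adjusted_r2` is ours; the inputs are read from the output (an \(R^2\) of 0.0011 and 53 residual degrees of freedom, that is 55 observations and one predictor).

```r
# Adjusted R-squared for n observations and p predictors (intercept not counted).
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Values read from the misdemeanor regression output above:
# R-squared 0.0011 and 53 residual degrees of freedom, i.e. n = 55 with p = 1.
adj <- adjusted_r2(r2 = 0.0011, n = 55, p = 1)
```

The result is slightly below zero and agrees with the reported -0.0177 up to rounding of the inputs: the penalty for the extra predictor outweighs the very small share of variance it explains.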

4.2.2 CCTVs VS Violent and Property Crime

#> 
#> CCTV vs ViolentCrimePerArea
#> ================================================
#>                         Dependent variable:     
#>                     ----------------------------
#>                     ViolentCrimePerCapitaPerArea
#> ------------------------------------------------
#> density_perc                 53.700***          
#>                               (7.080)           
#>                                                 
#> Constant                     205.000***         
#>                               (20.800)          
#>                                                 
#> ------------------------------------------------
#> Observations                     56             
#> R2                             0.516            
#> Adjusted R2                    0.507            
#> Residual Std. Error      124.000 (df = 54)      
#> F Statistic            57.600*** (df = 1; 54)   
#> ================================================
#> Note:                *p<0.1; **p<0.05; ***p<0.01
#> 
#> Property Crime vs CCTVs
#> =================================================
#>                          Dependent variable:     
#>                     -----------------------------
#>                     PropertyCrimePerCapitaPerArea
#> -------------------------------------------------
#> density_perc                  37.400***          
#>                                (8.100)           
#>                                                  
#> Constant                     258.000***          
#>                               (23.800)           
#>                                                  
#> -------------------------------------------------
#> Observations                      56             
#> R2                              0.284            
#> Adjusted R2                     0.270            
#> Residual Std. Error       141.000 (df = 54)      
#> F Statistic             21.400*** (df = 1; 54)   
#> =================================================
#> Note:                 *p<0.1; **p<0.05; ***p<0.01

We proceed in the same way as we did with felonies and misdemeanors.

#> K-means clustering with 5 clusters of sizes 7, 8, 6, 15, 20
#> 
#> Cluster means:
#>   ViolentCrimePerCapitaPerArea density_perc
#> 1                      -0.0722        0.613
#> 2                       1.3246        0.410
#> 3                       1.4711        2.299
#> 4                       0.0749       -0.522
#> 5                      -1.0021       -0.677
#> 
#> Clustering vector:
#>         Allendale/Irvington/S. Hilton 
#>                                     4 
#>       Beechfield/Ten Hills/West Hills 
#>                                     5 
#>                         Belair-Edison 
#>                                     4 
#>     Brooklyn/Curtis Bay/Hawkins Point 
#>                                     4 
#>                                Canton 
#>                                     4 
#>                     Cedonia/Frankford 
#>                                     4 
#>                           Cherry Hill 
#>                                     3 
#>             Chinquapin Park/Belvedere 
#>                                     5 
#>                   Claremont/Armistead 
#>                                     5 
#>                         Clifton-Berea 
#>                                     2 
#>               Cross-Country/Cheswolde 
#>                                     5 
#>              Dickeyville/Franklintown 
#>                                     5 
#>                  Dorchester/Ashburton 
#>                                     4 
#>                   Downtown/Seton Hill 
#>                                     3 
#>                     Edmondson Village 
#>                                     4 
#>                           Fells Point 
#>                                     1 
#>                  Forest Park/Walbrook 
#>                                     4 
#>                        Glen-Fallstaff 
#>                                     4 
#>       Greater Charles Village/Barclay 
#>                                     1 
#>                        Greater Govans 
#>                                     5 
#>                     Greater Mondawmin 
#>                                     2 
#>       Greater Roland Park/Poplar Hill 
#>                                     5 
#>                      Greater Rosemont 
#>                                     1 
#>                       Greenmount East 
#>                                     1 
#>                              Hamilton 
#>                                     5 
#>              Harbor East/Little Italy 
#>                                     2 
#>                      Harford/Echodale 
#>                                     5 
#>                          Highlandtown 
#>                                     5 
#>            Howard Park/West Arlington 
#>                                     5 
#>             Inner Harbor/Federal Hill 
#>                                     1 
#>                            Lauraville 
#>                                     5 
#>                            Loch Raven 
#>                                     5 
#>                      Madison/East End 
#>                                     2 
#>  Medfield/Hampden/Woodberry/Remington 
#>                                     5 
#>                               Midtown 
#>                                     1 
#>                     Midway/Coldstream 
#>                                     2 
#>              Morrell Park/Violetville 
#>                                     4 
#>           Mount Washington/Coldspring 
#>                                     5 
#>     North Baltimore/Guilford/Homeland 
#>                                     5 
#>                             Northwood 
#>                                     5 
#>                   Oldtown/Middle East 
#>                                     3 
#>         Orangeville/East Highlandtown 
#>                                     4 
#>           Patterson Park North & East 
#>                                     5 
#>             Penn North/Reservoir Hill 
#>                                     4 
#>             Pimlico/Arlington/Hilltop 
#>                                     2 
#> Poppleton/The Terraces/Hollins Market 
#>                                     2 
#>       Sandtown-Winchester/Harlem Park 
#>                                     3 
#>                       South Baltimore 
#>                                     5 
#>                          Southeastern 
#>                                     4 
#>                 Southern Park Heights 
#>                                     1 
#>                   Southwest Baltimore 
#>                                     3 
#>                         The Waverlies 
#>                                     4 
#>                   Upton/Druid Heights 
#>                                     3 
#>            Washington Village/Pigtown 
#>                                     2 
#>        Westport/Mount Winans/Lakeland 
#>                                     4 
#>                    Unassigned -- Jail 
#>                                     5 
#> 
#> Within cluster sum of squares by cluster:
#> [1] 1.57 1.97 4.76 3.02 2.50
#>  (between_SS / total_SS =  87.4 %)
#> 
#> Available components:
#> 
#> [1] "cluster"      "centers"      "totss"        "withinss"    
#> [5] "tot.withinss" "betweenss"    "size"         "iter"        
#> [9] "ifault"
#> # A tibble: 4 x 4
#>   Community           ViolentCrimePer1000in~ density_perc change_perc
#>   <chr>                                <dbl>        <dbl>       <dbl>
#> 1 Cherry Hill                           49.0         7.06       13.6 
#> 2 Downtown/Seton Hill                  123.         10.0        57.0 
#> 3 Oldtown/Middle East                   84.3         7.66       34.0 
#> 4 Sandtown-Wincheste~                   59.4         7.42       -4.86
#> # A tibble: 4 x 4
#>   Community                 ViolentCrimePer~ density_perc change_perc
#>   <chr>                                <dbl>        <dbl>       <dbl>
#> 1 Clifton-Berea                         84.7         2.15       26.6 
#> 2 Madison/East End                      83.0         3.59       21.3 
#> 3 Midway/Coldstream                     69.4         2.03       20.1 
#> 4 Pimlico/Arlington/Hilltop             58.9         1.20        5.14

We can check how violent crime evolved over time.

Violent_Crime_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(VIO_PROP_CFS=="VIOLENT") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "Overall, violent crime seems to have decreased for the 2017 to 2020 period",
       subtitle = str_wrap("This explains why we should be careful when considering CCTV effectiveness in deterring violent crime", width = 80),
       x = "Year", y = "Violent crime occurrences")

Violent_Crime_Yearly_evolution_map

regression8 <- lm(Violent_Crime_Evolution_VS_CCTV$change_perc~Violent_Crime_Evolution_VS_CCTV$density_perc)
summary(regression8)
#> 
#> Call:
#> lm(formula = Violent_Crime_Evolution_VS_CCTV$change_perc ~ Violent_Crime_Evolution_VS_CCTV$density_perc)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#>  -48.4  -21.5   -3.7   11.1   78.0 
#> 
#> Coefficients:
#>                                              Estimate Std. Error
#> (Intercept)                                    26.556      5.255
#> Violent_Crime_Evolution_VS_CCTV$density_perc    0.312      1.774
#>                                              t value Pr(>|t|)    
#> (Intercept)                                     5.05  5.5e-06 ***
#> Violent_Crime_Evolution_VS_CCTV$density_perc    0.18     0.86    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 30.9 on 53 degrees of freedom
#>   (1 observation deleted due to missingness)
#> Multiple R-squared:  0.000583,   Adjusted R-squared:  -0.0183 
#> F-statistic: 0.0309 on 1 and 53 DF,  p-value: 0.861

We continue with property crimes.

#> K-means clustering with 5 clusters of sizes 1, 3, 29, 16, 7
#> 
#> Cluster means:
#>   PropertyCrimePerCapitaPerArea density_perc
#> 1                         2.369        3.510
#> 2                         2.328        0.292
#> 3                        -0.741       -0.638
#> 4                         0.637        0.112
#> 5                         0.277        1.761
#> 
#> Clustering vector:
#>         Allendale/Irvington/S. Hilton 
#>                                     3 
#>       Beechfield/Ten Hills/West Hills 
#>                                     3 
#>                         Belair-Edison 
#>                                     4 
#>     Brooklyn/Curtis Bay/Hawkins Point 
#>                                     4 
#>                                Canton 
#>                                     2 
#>                     Cedonia/Frankford 
#>                                     3 
#>                           Cherry Hill 
#>                                     5 
#>             Chinquapin Park/Belvedere 
#>                                     3 
#>                   Claremont/Armistead 
#>                                     3 
#>                         Clifton-Berea 
#>                                     4 
#>               Cross-Country/Cheswolde 
#>                                     3 
#>              Dickeyville/Franklintown 
#>                                     3 
#>                  Dorchester/Ashburton 
#>                                     3 
#>                   Downtown/Seton Hill 
#>                                     1 
#>                     Edmondson Village 
#>                                     3 
#>                           Fells Point 
#>                                     4 
#>                  Forest Park/Walbrook 
#>                                     3 
#>                        Glen-Fallstaff 
#>                                     3 
#>       Greater Charles Village/Barclay 
#>                                     5 
#>                        Greater Govans 
#>                                     3 
#>                     Greater Mondawmin 
#>                                     4 
#>       Greater Roland Park/Poplar Hill 
#>                                     3 
#>                      Greater Rosemont 
#>                                     5 
#>                       Greenmount East 
#>                                     4 
#>                              Hamilton 
#>                                     3 
#>              Harbor East/Little Italy 
#>                                     2 
#>                      Harford/Echodale 
#>                                     3 
#>                          Highlandtown 
#>                                     3 
#>            Howard Park/West Arlington 
#>                                     3 
#>             Inner Harbor/Federal Hill 
#>                                     4 
#>                            Lauraville 
#>                                     3 
#>                            Loch Raven 
#>                                     3 
#>                      Madison/East End 
#>                                     4 
#>  Medfield/Hampden/Woodberry/Remington 
#>                                     3 
#>                               Midtown 
#>                                     4 
#>                     Midway/Coldstream 
#>                                     4 
#>              Morrell Park/Violetville 
#>                                     3 
#>           Mount Washington/Coldspring 
#>                                     3 
#>     North Baltimore/Guilford/Homeland 
#>                                     3 
#>                             Northwood 
#>                                     3 
#>                   Oldtown/Middle East 
#>                                     5 
#>         Orangeville/East Highlandtown 
#>                                     4 
#>           Patterson Park North & East 
#>                                     3 
#>             Penn North/Reservoir Hill 
#>                                     4 
#>             Pimlico/Arlington/Hilltop 
#>                                     4 
#> Poppleton/The Terraces/Hollins Market 
#>                                     4 
#>       Sandtown-Winchester/Harlem Park 
#>                                     5 
#>                       South Baltimore 
#>                                     3 
#>                          Southeastern 
#>                                     4 
#>                 Southern Park Heights 
#>                                     3 
#>                   Southwest Baltimore 
#>                                     5 
#>                         The Waverlies 
#>                                     4 
#>                   Upton/Druid Heights 
#>                                     5 
#>            Washington Village/Pigtown 
#>                                     2 
#>        Westport/Mount Winans/Lakeland 
#>                                     3 
#>                    Unassigned -- Jail 
#>                                     3 
#> 
#> Within cluster sum of squares by cluster:
#> [1] 0.00 1.88 6.92 6.30 3.81
#>  (between_SS / total_SS =  82.8 %)
#> 
#> Available components:
#> 
#> [1] "cluster"      "centers"      "totss"        "withinss"    
#> [5] "tot.withinss" "betweenss"    "size"         "iter"        
#> [9] "ifault"
#> # A tibble: 4 x 4
#>   Community           PropertyCrimePer1000i~ density_perc change_perc
#>   <chr>                                <dbl>        <dbl>       <dbl>
#> 1 Downtown/Seton Hill                   90.4        10.0        23.4 
#> 2 Oldtown/Middle East                   56.3         7.66       15.4 
#> 3 Sandtown-Wincheste~                   54.8         7.42      -11.8 
#> 4 Upton/Druid Heights                   57.3         5.74        4.31
#> # A tibble: 4 x 4
#>   Community                  PropertyCrimePe~ density_perc change_perc
#>   <chr>                                 <dbl>        <dbl>       <dbl>
#> 1 Canton                                 90.3        0.239       11.3 
#> 2 Harbor East/Little Italy              120.         3.47       -49.7 
#> 3 Southeastern                           67.8        0.239        5.23
#> 4 Washington Village/Pigtown            115.         3.71        -5.38

We can check how property crime evolved over time.

Property_Crime_Yearly_evolution_map <- crime_data_with_areas %>%
  filter(VIO_PROP_CFS=="PROPERTY") %>% 
  count(year=floor_date(CrimeDateTime,"year")) %>% 
  ggplot(aes(year,n))+geom_line()+
  scale_x_date(limits = c(as.Date("2014-01-01"), as.Date("2020-12-31"))) +
  labs(title = "Overall, property crime seems to have decreased for the 2017 to 2020 period",
       subtitle = str_wrap("This explains why we should be careful when considering CCTV effectiveness in deterring property crime", width = 80),
       x = "Year", y = "Property crime occurrences")

Property_Crime_Yearly_evolution_map

regression9 <- lm(Property_Crime_Evolution_VS_CCTV$change_perc~Property_Crime_Evolution_VS_CCTV$density_perc)
summary(regression9)
#> 
#> Call:
#> lm(formula = Property_Crime_Evolution_VS_CCTV$change_perc ~ Property_Crime_Evolution_VS_CCTV$density_perc)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -41.88 -14.23   0.36  10.35 111.30 
#> 
#> Coefficients:
#>                                               Estimate Std. Error
#> (Intercept)                                     -12.42       3.83
#> Property_Crime_Evolution_VS_CCTV$density_perc     1.33       1.29
#>                                               t value Pr(>|t|)   
#> (Intercept)                                     -3.25    0.002 **
#> Property_Crime_Evolution_VS_CCTV$density_perc    1.03    0.306   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 22.5 on 53 degrees of freedom
#>   (1 observation deleted due to missingness)
#> Multiple R-squared:  0.0197, Adjusted R-squared:  0.00125 
#> F-statistic: 1.07 on 1 and 53 DF,  p-value: 0.306

4.3 Research question 3 - Is the impact of CCTV on crime reduction higher, lower, or the same in higher-income neighborhoods compared to lower-income neighborhoods?

As the actual effectiveness of CCTV is unclear, our third research question cannot be answered in a meaningful way. What we can still investigate is how crime has evolved in each area depending on its poverty level. It is interesting to see whether, for example, inequalities in security between richer and poorer areas have changed over time.

#> 
#> Call:
#> lm(formula = Evolution_VS_poverty$hhpov19 ~ Evolution_VS_poverty$change)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -18.12  -8.70  -1.99   7.51  26.08 
#> 
#> Coefficients:
#>                             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)                    16.24       1.49   10.88  4.1e-15 ***
#> Evolution_VS_poverty$change    14.41       7.52    1.92    0.061 .  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 10.9 on 53 degrees of freedom
#>   (1 observation deleted due to missingness)
#> Multiple R-squared:  0.0648, Adjusted R-squared:  0.0471 
#> F-statistic: 3.67 on 1 and 53 DF,  p-value: 0.0608

The very low \(R^2\) (0.065) indicates a weak correlation between the change in crime per capita from 2014 to 2019 and the poverty metric used; the slope is only marginally significant (p = 0.061).
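A minimal sketch of how such a comparison might be set up, using invented numbers rather than the project data. Only `hhpov19` and `change` appear in the regression output above; the columns `crime_pc_2014` and `crime_pc_2019`, and the definition of `change` as a relative change, are assumptions made for illustration.

```r
# Toy data: five hypothetical areas (all values invented for illustration).
toy <- data.frame(
  Community     = c("A", "B", "C", "D", "E"),
  crime_pc_2014 = c(50, 80, 30, 60, 40),   # crimes per 1000 inhabitants, 2014
  crime_pc_2019 = c(55, 72, 33, 75, 38),   # crimes per 1000 inhabitants, 2019
  hhpov19       = c(12, 25, 8, 30, 15)     # poverty metric
)

# Assumed definition: relative change in crime per capita between the two years.
toy$change <- (toy$crime_pc_2019 - toy$crime_pc_2014) / toy$crime_pc_2014

# Same orientation as the regression above: poverty on the left-hand side.
fit <- lm(hhpov19 ~ change, data = toy)

# R-squared and the p-value of the slope summarise the strength of the link.
r2      <- summary(fit)$r.squared
p_slope <- summary(fit)$coefficients["change", "Pr(>|t|)"]
```

Passing the data frame through `data =` instead of writing `toy$...` inside the formula keeps the coefficient names short in the summary.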

4.4 Comparison of CCTV density and wealth

We want to see whether there is a correlation between CCTV density and wealth. One of our initial hypotheses was that the government shows more respect for the privacy of wealthier people. So, similarly, we perform a regression of CCTV density on the poverty metric. The results are not conclusive: although the poverty coefficient is statistically significant, both the \(R^2\) (0.258) and the adjusted \(R^2\) (0.244) are low. The next sub-section illustrates this in a map.

#> 
#> CCTV vs poverty
#> ===============================================
#>                         Dependent variable:    
#>                     ---------------------------
#>                            density_perc        
#> -----------------------------------------------
#> hhpov19                      0.106***          
#>                               (0.024)          
#>                                                
#> Constant                       0.052           
#>                               (0.485)          
#>                                                
#> -----------------------------------------------
#> Observations                    56             
#> R2                             0.258           
#> Adjusted R2                    0.244           
#> Residual Std. Error       2.050 (df = 54)      
#> F Statistic           18.700*** (df = 1; 54)   
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01
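For reference, the adjusted \(R^2\) in the table above can be recovered from \(R^2\) with the standard formula, taking \(n = 56\) observations and \(p = 1\) predictor from the table:

```latex
\[
\bar{R}^2 \;=\; 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
\;=\; 1 - (1 - 0.258)\cdot\frac{55}{54}
\;\approx\; 0.244
\]
```

With a single predictor the penalty term is small, which is why the two values are nearly identical.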

4.4.1 Mapping of CCTVs and wealth

The methodology to create the map is always the same: we ensure a perfect match between the datasets, then merge the data using left_join, and finally create the map using tmap. While the simple linear regression was not conclusive, the map reveals interesting patterns. Areas with no CCTVs are more likely to be quite wealthy. However, we are not sure whether wealth is the only influential factor here; we think CCTV density is more closely related to crime per capita in these areas. Again, in the northern parts we see fewer CCTVs, less crime, and a wealthier population.

#>  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [14] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [27] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [40] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [53] TRUE TRUE TRUE TRUE
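The vector of TRUE values above is the kind of output the match check produces. A minimal sketch of the first two steps, with invented tables standing in for the project data (the names `areas`, `indicators` and `area_id` are hypothetical, as are all values):

```r
library(dplyr)

# Toy stand-ins for the boundary table and the indicator table.
areas <- data.frame(Community = c("Area A", "Area B", "Area C"),
                    area_id   = 1:3)
indicators <- data.frame(Community    = c("Area A", "Area B", "Area C"),
                         density_perc = c(0.2, 5.1, 0.0),
                         hhpov19      = c(6.0, 14.5, 9.8))

# Step 1: check that every community name in one table exists in the other.
match_check <- areas$Community %in% indicators$Community
stopifnot(all(match_check))

# Step 2: attach the indicators to the areas, keeping every area row.
merged <- left_join(areas, indicators, by = "Community")

# Step 3 (not run here): with `merged` as an sf object carrying the area
# geometries, the map would be drawn with tmap, for example
# tm_shape(merged) + tm_polygons("density_perc")
```

Running the match check before the join matters because `left_join` silently fills unmatched rows with `NA`, which would show up as blank areas on the map.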

4.5 Comparison of crimes and wealth

We want to investigate whether wealthier areas are more or less affected by crime. To do so, we once more compute a simple linear regression. The \(R^2\) is again quite low (0.242), although the poverty coefficient is statistically significant.

#> 
#> CrimePer1000Inhabitants vs Poverty
#> ===============================================
#>                         Dependent variable:    
#>                     ---------------------------
#>                       CrimePer1000inhabitants  
#> -----------------------------------------------
#> hhpov19                      14.200***         
#>                               (3.430)          
#>                                                
#> Constant                    392.000***         
#>                              (68.200)          
#>                                                
#> -----------------------------------------------
#> Observations                    56             
#> R2                             0.242           
#> Adjusted R2                    0.228           
#> Residual Std. Error      288.000 (df = 54)     
#> F Statistic           17.300*** (df = 1; 54)   
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01

4.6 Felonies VS Misdemeanors - Do we have an equal crime type distribution?

It is always interesting to see whether we can spot patterns in crime data. The idea here is to analyse whether felonies and misdemeanors tend to be equally distributed in each area. By computing a simple linear regression, we see that the two types of crime seem rather equally distributed in each area. Still, it is interesting to observe that the biggest outlier on the scatter plot is Downtown/Seton Hill. In Downtown, misdemeanors per capita are much higher than felonies per capita. We don't know whether this finding is relevant, yet it must be mentioned that this area is also one of the richest areas in Baltimore. This might suggest that richer areas are more affected by less severe crimes.

#> 
#> Felony vs Misdemeanor
#> =======================================================
#>                                 Dependent variable:    
#>                             ---------------------------
#>                               FelonyPerCapitaPerArea   
#> -------------------------------------------------------
#> MisdemeanorPerCapitaPerArea          0.616***          
#>                                       (0.064)          
#>                                                        
#> Constant                             81.200***         
#>                                      (25.000)          
#>                                                        
#> -------------------------------------------------------
#> Observations                            56             
#> R2                                     0.629           
#> Adjusted R2                            0.622           
#> Residual Std. Error              92.800 (df = 54)      
#> F Statistic                   91.700*** (df = 1; 54)   
#> =======================================================
#> Note:                       *p<0.1; **p<0.05; ***p<0.01
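The report identifies the outlier visually on the scatter plot. One way to flag such a point numerically, shown here only as a sketch on invented data, is Cook's distance, which combines the size of a residual with the leverage of the point. The column names `Misdemeanor` and `Felony` and all values are hypothetical.

```r
# Toy data: six hypothetical areas; area "F" has far more misdemeanors
# than its felony level would suggest (values invented for illustration).
toy <- data.frame(
  Community   = c("A", "B", "C", "D", "E", "F"),
  Misdemeanor = c(100, 200, 300, 400, 500, 1500),
  Felony      = c(140, 205, 265, 330, 390, 500)
)

fit <- lm(Felony ~ Misdemeanor, data = toy)

# Cook's distance measures how much the fitted line changes when one
# observation is left out; the largest value marks the most influential area.
influence <- cooks.distance(fit)
worst     <- toy$Community[which.max(influence)]
```

Raw residuals alone can miss such a point: an extreme observation pulls the fitted line towards itself, so its own residual may end up smaller than those of ordinary points.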

4.7 Attempt to create a more accurate model: Multiple Regression

#> Start:  AIC=96.9
#> Community_data$density_perc ~ 1
#> 
#>                                               Df Sum of Sq RSS  AIC
#> + Community_data$ViolentCrimePerCapitaPerArea  1     157.3 147 58.2
#> + Community_data$hhpov19                       1      78.5 226 82.2
#> + Community_data$hsagov14                      1      61.6 243 86.2
#> + Community_data$nohhint19                     1      19.4 285 95.2
#> + Community_data$unempr19                      1      16.0 289 95.8
#> + Community_data$pwhite20                      1      10.9 294 96.8
#> <none>                                                     305 96.9
#> + Community_data$paa20                         1      10.6 294 96.9
#> + Community_data$ready13                       1       0.9 304 98.7
#> 
#> Step:  AIC=58.2
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea
#> 
#>                            Df Sum of Sq RSS  AIC
#> + Community_data$hsagov14   1      8.25 139 57.0
#> <none>                                  147 58.2
#> + Community_data$unempr19   1      4.46 143 58.5
#> + Community_data$ready13    1      3.03 144 59.0
#> + Community_data$nohhint19  1      2.66 145 59.2
#> + Community_data$hhpov19    1      2.24 145 59.3
#> + Community_data$paa20      1      1.63 146 59.6
#> + Community_data$pwhite20   1      1.38 146 59.7
#> 
#> Step:  AIC=57
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$hsagov14
#> 
#>                            Df Sum of Sq RSS  AIC
#> + Community_data$pwhite20   1     13.02 126 53.5
#> + Community_data$unempr19   1      6.58 133 56.3
#> + Community_data$nohhint19  1      5.09 134 56.9
#> <none>                                  139 57.0
#> + Community_data$paa20      1      4.49 135 57.1
#> + Community_data$hhpov19    1      0.43 139 58.8
#> + Community_data$ready13    1      0.03 139 59.0
#> 
#> Step:  AIC=53.5
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$hsagov14 + Community_data$pwhite20
#> 
#>                            Df Sum of Sq RSS  AIC
#> + Community_data$paa20      1      6.70 119 52.4
#> + Community_data$hhpov19    1      4.72 121 53.3
#> <none>                                  126 53.5
#> + Community_data$ready13    1      0.72 125 55.2
#> + Community_data$unempr19   1      0.04 126 55.5
#> + Community_data$nohhint19  1      0.01 126 55.5
#> 
#> Step:  AIC=52.4
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$hsagov14 + Community_data$pwhite20 + Community_data$paa20
#> 
#>                            Df Sum of Sq RSS  AIC
#> <none>                                  119 52.4
#> + Community_data$hhpov19    1      3.62 116 52.7
#> + Community_data$unempr19   1      0.68 119 54.1
#> + Community_data$nohhint19  1      0.31 119 54.3
#> + Community_data$ready13    1      0.00 119 54.4
#> 
#> Community data vs
#> ========================================================
#>                                  Dependent variable:    
#>                              ---------------------------
#>                                     density_perc        
#> --------------------------------------------------------
#> ViolentCrimePerCapitaPerArea          0.008***          
#>                                        (0.001)          
#>                                                         
#> hsagov14                              -0.068***         
#>                                        (0.020)          
#>                                                         
#> pwhite20                               0.065**          
#>                                        (0.025)          
#>                                                         
#> paa20                                  0.028*           
#>                                        (0.016)          
#>                                                         
#> Constant                                0.131           
#>                                        (1.270)          
#>                                                         
#> --------------------------------------------------------
#> Observations                             56             
#> R2                                      0.608           
#> Adjusted R2                             0.577           
#> Residual Std. Error                1.530 (df = 51)      
#> F Statistic                    19.800*** (df = 4; 51)   
#> ========================================================
#> Note:                        *p<0.1; **p<0.05; ***p<0.01
#> Community_data$ViolentCrimePerCapitaPerArea 
#>                                        1.35 
#>                     Community_data$hsagov14 
#>                                        2.87 
#>                     Community_data$pwhite20 
#>                                        9.33 
#>                        Community_data$paa20 
#>                                        6.48
#> 
#> Call:
#> lm(formula = Community_data$paa20 ~ Community_data$pwhite20)
#> 
#> Coefficients:
#>             (Intercept)  Community_data$pwhite20  
#>                   85.95                    -1.13
#> 
#> Call:
#> lm(formula = Community_data$paa20 ~ Community_data$pwhite20)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -85.95  -3.17   6.18   7.89  12.24 
#> 
#> Coefficients:
#>                         Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)              85.9489     3.0104    28.6   <2e-16 ***
#> Community_data$pwhite20  -1.1264     0.0857   -13.2   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 15.7 on 54 degrees of freedom
#> Multiple R-squared:  0.762,  Adjusted R-squared:  0.758 
#> F-statistic:  173 on 1 and 54 DF,  p-value: <2e-16
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$pwhite20 + 
#>     Community_data$paa20)
#> 
#> Coefficients:
#>             (Intercept)  Community_data$pwhite20  
#>                 1.66600                 -0.01050  
#>    Community_data$paa20  
#>                 0.00666
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$pwhite20 + 
#>     Community_data$paa20)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#>  -2.23  -1.66  -0.85   1.41   8.49 
#> 
#> Coefficients:
#>                         Estimate Std. Error t value Pr(>|t|)
#> (Intercept)              1.66600    1.80406    0.92     0.36
#> Community_data$pwhite20 -0.01050    0.02623   -0.40     0.69
#> Community_data$paa20     0.00666    0.02033    0.33     0.74
#> 
#> Residual standard error: 2.35 on 53 degrees of freedom
#> Multiple R-squared:  0.0379, Adjusted R-squared:  0.00155 
#> F-statistic: 1.04 on 2 and 53 DF,  p-value: 0.36
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$pwhite20)
#> 
#> Coefficients:
#>             (Intercept)  Community_data$pwhite20  
#>                   2.238                   -0.018
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$pwhite20)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -2.174 -1.609 -0.928  1.385  8.426 
#> 
#> Coefficients:
#>                         Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)               2.2381     0.4459    5.02    6e-06 ***
#> Community_data$pwhite20  -0.0180     0.0127   -1.42     0.16    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.33 on 54 degrees of freedom
#> Multiple R-squared:  0.0359, Adjusted R-squared:  0.0181 
#> F-statistic: 2.01 on 1 and 54 DF,  p-value: 0.162
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$paa20)
#> 
#> Coefficients:
#>          (Intercept)  Community_data$paa20  
#>               0.9925                0.0138
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$paa20)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -2.229 -1.698 -0.905  1.362  8.535 
#> 
#> Coefficients:
#>                      Estimate Std. Error t value Pr(>|t|)
#> (Intercept)           0.99252    0.64731    1.53     0.13
#> Community_data$paa20  0.01376    0.00984    1.40     0.17
#> 
#> Residual standard error: 2.33 on 54 degrees of freedom
#> Multiple R-squared:  0.0349, Adjusted R-squared:  0.0171 
#> F-statistic: 1.96 on 1 and 54 DF,  p-value: 0.168
#> Start:  AIC=96.9
#> Community_data$density_perc ~ 1
#> 
#>                                               Df Sum of Sq RSS  AIC
#> + Community_data$ViolentCrimePerCapitaPerArea  1     157.3 147 58.2
#> + Community_data$hhpov19                       1      78.5 226 82.2
#> + Community_data$hsagov14                      1      61.6 243 86.2
#> + Community_data$nohhint19                     1      19.4 285 95.2
#> + Community_data$unempr19                      1      16.0 289 95.8
#> <none>                                                     305 96.9
#> + Community_data$ready13                       1       0.9 304 98.7
#> 
#> Step:  AIC=58.2
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea
#> 
#>                            Df Sum of Sq RSS  AIC
#> + Community_data$hsagov14   1      8.25 139 57.0
#> <none>                                  147 58.2
#> + Community_data$unempr19   1      4.46 143 58.5
#> + Community_data$ready13    1      3.03 144 59.0
#> + Community_data$nohhint19  1      2.66 145 59.2
#> + Community_data$hhpov19    1      2.24 145 59.3
#> 
#> Step:  AIC=57
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$hsagov14
#> 
#>                            Df Sum of Sq RSS  AIC
#> + Community_data$unempr19   1      6.58 133 56.3
#> + Community_data$nohhint19  1      5.09 134 56.9
#> <none>                                  139 57.0
#> + Community_data$hhpov19    1      0.43 139 58.8
#> + Community_data$ready13    1      0.03 139 59.0
#> 
#> Step:  AIC=56.3
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$hsagov14 + Community_data$unempr19
#> 
#>                            Df Sum of Sq RSS  AIC
#> + Community_data$hhpov19    1      6.58 126 55.4
#> <none>                                  133 56.3
#> + Community_data$nohhint19  1      0.40 132 58.1
#> + Community_data$ready13    1      0.15 132 58.2
#> 
#> Step:  AIC=55.4
#> Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$hsagov14 + Community_data$unempr19 + Community_data$hhpov19
#> 
#>                            Df Sum of Sq RSS  AIC
#> <none>                                  126 55.4
#> + Community_data$nohhint19  1     1.969 124 56.5
#> + Community_data$ready13    1     0.005 126 57.4
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$hsagov14 + Community_data$unempr19 + Community_data$hhpov19)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -3.010 -1.083 -0.269  0.946  3.436 
#> 
#> Coefficients:
#>                                             Estimate Std. Error
#> (Intercept)                                  0.83786    1.15304
#> Community_data$ViolentCrimePerCapitaPerArea  0.00856    0.00155
#> Community_data$hsagov14                     -0.02161    0.01413
#> Community_data$unempr19                     -0.12576    0.05540
#> Community_data$hhpov19                       0.04932    0.03023
#>                                             t value Pr(>|t|)    
#> (Intercept)                                    0.73    0.471    
#> Community_data$ViolentCrimePerCapitaPerArea    5.50  1.2e-06 ***
#> Community_data$hsagov14                       -1.53    0.132    
#> Community_data$unempr19                       -2.27    0.027 *  
#> Community_data$hhpov19                         1.63    0.109    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.57 on 51 degrees of freedom
#> Multiple R-squared:  0.586,  Adjusted R-squared:  0.554 
#> F-statistic: 18.1 on 4 and 51 DF,  p-value: 2.66e-09
#> Community_data$ViolentCrimePerCapitaPerArea 
#>                                        1.67 
#>                     Community_data$hsagov14 
#>                                        1.34 
#>                     Community_data$unempr19 
#>                                        1.93 
#>                      Community_data$hhpov19 
#>                                        2.60
#> 
#> Call:
#> lm(formula = Community_data$density_perc ~ Community_data$ViolentCrimePerCapitaPerArea + 
#>     Community_data$unempr19)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -3.047 -0.964 -0.367  0.919  4.197 
#> 
#> Coefficients:
#>                                             Estimate Std. Error
#> (Intercept)                                 -0.81792    0.49371
#> Community_data$ViolentCrimePerCapitaPerArea  0.01046    0.00142
#> Community_data$unempr19                     -0.06060    0.04711
#>                                             t value Pr(>|t|)    
#> (Intercept)                                   -1.66      0.1    
#> Community_data$ViolentCrimePerCapitaPerArea    7.35  1.2e-09 ***
#> Community_data$unempr19                       -1.29      0.2    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.64 on 53 degrees of freedom
#> Multiple R-squared:  0.531,  Adjusted R-squared:  0.513 
#> F-statistic:   30 on 2 and 53 DF,  p-value: 1.95e-09
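
The two fits above can also be compared formally. The following is a minimal sketch under stated assumptions, not the code used for this report: `community` is a synthetic stand-in for `Community_data` with made-up values (only the column names and the 56 observations implied by the degrees of freedom are taken from the output above). It passes `data =` to `lm()`, which keeps the coefficient labels short, and uses a partial F-test to check whether dropping `hsagov14` and `hhpov19` loses explanatory power.

```r
# Synthetic stand-in for Community_data; all values are made up for illustration.
set.seed(1)
n <- 56
community <- data.frame(
  ViolentCrimePerCapitaPerArea = runif(n, 0, 600),
  hsagov14 = runif(n, 40, 95),
  unempr19 = runif(n, 1, 20),
  hhpov19  = runif(n, 2, 45)
)
community$density_perc <- 0.01 * community$ViolentCrimePerCapitaPerArea +
  rnorm(n, sd = 1.5)

# Passing data = keeps coefficient names short in the summary() output.
full <- lm(density_perc ~ ViolentCrimePerCapitaPerArea + hsagov14 +
             unempr19 + hhpov19, data = community)
reduced <- lm(density_perc ~ ViolentCrimePerCapitaPerArea + unempr19,
              data = community)

# Partial F-test for the two nested models.
comparison <- anova(reduced, full)
print(comparison)

# Adjusted R-squared of both fits, side by side.
adj_r2 <- c(full = summary(full)$adj.r.squared,
            reduced = summary(reduced)$adj.r.squared)
print(adj_r2)
```

In the output above, the adjusted R-squared falls from 0.554 to 0.513 when the two predictors are removed; a test of this kind indicates whether such a drop is larger than chance alone would explain.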

4.67 Next steps after feedback on project update

  • Adjust according to the feedback
  • Create new models (potentially multiple linear regression?)
  • Finalise interpretations and answer research questions
  • Compare results with other research
  • Create some more visualisations (if useful and needed)
  • Include Executive summary at the beginning
  • Add additional data if needed
  • Create bibliography

Conclusion

In summary, our analysis shows that CCTV placement tends to follow the areas where crime per capita is highest. We also observe that some crimes are committed directly in front of CCTVs; although this is not conclusive evidence, it goes against the idea that CCTVs are effective crime deterrents. Regarding crime types, simple linear regressions give a weak \(R^2\) for both felonies and misdemeanors, so the presence of CCTVs does not seem to have a particularly strong impact on any one type of crime. Moreover, one of our initial hypotheses was that the government shows more respect for the privacy of wealthier people; the data do not support it. However, we are not sure whether wealth is the only influential factor here. Regarding the distribution and pattern of crime types in Baltimore, misdemeanors per capita in Downtown are much higher than felonies per capita, and Downtown is also one of the richest areas in Baltimore. This suggests that richer areas are more affected by less severe crimes.

Limitations: Our limitations concern the data, the methodology and selection bias. First, the datasets themselves are limited; for example, the “CAM_NUM” column in the CCTV dataset suggests that not all CCTVs are included in our data. Second, we used simple regression models to estimate the effects of certain criteria on our variables of interest; more advanced modelling and prediction approaches exist.

Future work: We carried out descriptive data analysis only, without any modelling on external datasets. One proposal for future work is to build a model on our Baltimore dataset, treating it as the training set, and then apply this model to a dataset from a different city to see whether we could predict “where the next crimes would happen and where CCTVs would have a big impact”. Trees or random forests could be used for this prediction. Moreover, one could also study whether it is really the CCTVs that determine crime, or whether there are alternatives to CCTVs. Perhaps nudging people to do good by displaying posters with motivational verses, e.g. “Do to others how you yourself want to be treated”, would have a higher impact and be cheaper overall.
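
The idea of fitting on one city and predicting on another could be sketched as follows. This is only an illustration under loud assumptions: `make_city()` is a hypothetical helper that generates made-up data, the column names are invented, and a single regression tree from the `rpart` package (a recommended package shipped with standard R installations) stands in for whatever model would eventually be chosen.

```r
library(rpart)  # recommended package; assumed to be installed

set.seed(42)

# Hypothetical helper: made-up community-level data for one city.
make_city <- function(n) {
  unemployment <- runif(n, 1, 20)
  poverty <- runif(n, 2, 45)
  data.frame(
    unemployment = unemployment,
    poverty = poverty,
    crime_rate = 5 + 0.8 * unemployment + 0.3 * poverty + rnorm(n, sd = 2)
  )
}

baltimore  <- make_city(56)  # city used for fitting
other_city <- make_city(40)  # second city, never seen during fitting

# Regression tree fitted on the first city only.
tree <- rpart(crime_rate ~ unemployment + poverty,
              data = baltimore, method = "anova")

# Predictions for the second city and their root mean squared error.
pred <- predict(tree, newdata = other_city)
rmse <- sqrt(mean((other_city$crime_rate - pred)^2))
print(rmse)
```

Evaluating on a city that played no part in fitting gives a more honest picture of how well such a model transfers than an in-sample fit would.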